POLYNUCLEOTIDES FOR USE AS TAGS AND TAG COMPLEMENTS IN THE DETECTION OF NUCLEIC ACID SEQUENCES

FIELD OF THE INVENTION 

This invention relates to families of oligonucleotides for use as tags, for
example, in the sorting of molecules, identification of target nucleic acid
molecules, or analysis of the presence of a mutation or polymorphism at a
locus of each target nucleic acid molecule.

BACKGROUND OF THE INVENTION 

Single-nucleotide polymorphisms (SNPs) are the most common form of 
genetic polymorphism. This, coupled with their potential as functional 
variants, has produced a great deal of interest in SNPs both as 
pharmacogenetic indicators and as markers for mapping genes for complex 
diseases. A large number of SNPs have already been identified, with >21,000
entries in the NCBI's SNP database alone. Many recent studies have focused
on identifying polymorphisms that lie in the coding sequence of potential 
candidate genes for common diseases. The ability to genotype this abundant 
source of variation rapidly and accurately is becoming an ever more important 
goal in the genetics community. A variety of technologies have the potential 
to transfer to high-throughput genotyping laboratories. These include 5' 
exonuclease assays, such as TaqMan (Livak et al. 1995), molecular beacons
(Tyagi et al. 1996), dye-labeled oligonucleotide ligation (DOL) (Chen et al.
1998), oligonucleotide-ligation assays (OLAs) (Tobe et al. 1996),
minisequencing (Chen and Kwok 1997; Pastinen et al. 1997), microarray
technology (Hacia et al. 1998; Wang et al. 1998), mass spectroscopy (Ross et
al. 1998) and the scorpions assay (Whitcombe et al. 1999). However, no
single chemistry has gained acceptance as the technology of choice. A 
suitable method for such applications must be accurate and homogeneous,
develop a robust, easily interpretable signal, and be flexible enough to
extend to novel loci with little optimization. These features will lend the
technology to automation. 

Third Wave Technologies, Inc., has developed a new mutation
detection method referred to as the Invader Assay. The Invader Assay is
based on a novel linear signal amplification technology that requires 
specific hybridization of two "overlapping" oligonucleotides and subsequent 
recognition and cleavage of this structure by the Cleavase enzyme. Cleavases 
are bacterial enzymes that cleave unpaired DNA strands or "flaps" near a
nick, for instance when the 5' end of a sequence is displaced by the 3' end
of an elongating upstream oligonucleotide. Enzymes with this so-called flap
endonuclease activity typically excise the redundant 5' "flap" of the
downstream oligonucleotide, leaving a simple nick to be repaired by ligases.
The excised "flap" is subsequently detected by one of several methods 
commonly known in the art. Cleavases have stringent requirements relative to 
the structure formed by such overlapping DNA sequences, and can be used to 
specifically detect single base pair mismatches immediately upstream of the 
cleavage site on the downstream DNA strand. Thermostable cleavases permit
reactions to be performed at temperatures sufficiently high to promote 
turnover and consequent signal amplification without the need for temperature 
cycling. 

While the Invader Assay offers exquisite specificity, its use in the
detection of multiple distinct target nucleic acids in a single experiment,
i.e., multiplexing, is limited. This is because if the Invader Assay is to
be used in a high-throughput gene microarray format, the most efficient 
method of detecting the excised "flap" sequence is to capture the sequence by 
hybridization to its complementary nucleic acid sequence attached to a solid 
phase support. 

Working in a highly parallel hybridization environment requiring
specific hybridization imposes very rigorous selection criteria for the
design of families of oligonucleotides that are to be used. The success of 
these approaches is dependent on the specific hybridization of a probe and 
its complement. Problems arise as the family of nucleic acid molecules 
cross-hybridise or hybridise incorrectly to the target sequences. While it 
is common to obtain incorrect hybridization resulting in false positives or 
an inability to form hybrids resulting in false negatives, the frequency of 
such results must be minimized. In order to achieve this goal certain 
thermodynamic properties of forming nucleic acid hybrids must be considered. 
The temperature at which oligonucleotides form duplexes with their
complementary sequences, known as the Tm (the temperature at which 50% of
the nucleic acid duplex is dissociated), varies according to a number of
sequence-dependent properties, including the hydrogen bonding energies of the
canonical pairs A-T and G-C (reflected in GC or base composition), stacking
free energy and, to a lesser extent, nearest neighbour interactions. These energies vary
widely among oligonucleotides that are typically used in hybridization 
assays. For example, hybridization of two probe sequences composed of 24
nucleotides, one with a 40% GC content and the other with a 60% GC content,
each with its complementary target under standard conditions theoretically
may have a 10°C difference in melting temperature (Mueller et al., Current
Protocols in Mol. Biol.; 15, 5:1993). Problems in hybridization occur when
the hybrids are allowed to form under hybridization conditions that include a 
single hybridization temperature that is not optimal for correct 
hybridization of all oligonucleotide sequences of a set. Mismatch 
hybridization of non-complementary probes can occur forming duplexes with 
measurable mismatch stability (SantaLucia et al., Biochemistry; 38: 3468-77,
1999). Mismatching of duplexes in a particular set of oligonucleotides can
occur under hybridization conditions where the mismatch decreases duplex
stability yet still leaves the mismatched duplex with a higher Tm than the least stable correct
duplex of that particular set. For example, if hybridization is carried out 
under conditions that favor the AT-rich perfect match duplex sequence, the 
possibility exists for hybridizing a GC-rich duplex sequence that contains a 
mismatched base having a melting temperature that is still above the 
correctly formed AT-rich duplex. Therefore, design of families of 
oligonucleotide sequences that can be used in multiplexed hybridization 
reactions must include consideration for the thermodynamic properties of 
oligonucleotides and duplex formation that will reduce or eliminate cross 
hybridization behavior within the designed oligonucleotide set. 
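
The GC-content effect described above can be illustrated with a rough calculation. The sketch below uses one commonly quoted empirical approximation, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 600/N. This formula, the function name `approx_tm` and the 1 M sodium concentration are assumptions introduced here for illustration and are not taken from the cited reference. Under this approximation two 24mers at 40% and 60% GC differ by roughly 8°C, the same order as the 10°C figure quoted above.

```python
import math


def approx_tm(gc_percent: float, length: int, sodium_molar: float = 1.0) -> float:
    """Rough empirical Tm estimate in degrees C (illustrative assumption only)."""
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 600.0 / length)


# Two hypothetical 24mers that differ only in GC content.
tm_low = approx_tm(40.0, 24)
tm_high = approx_tm(60.0, 24)
tm_gap = tm_high - tm_low

print(f"40% GC: {tm_low:.1f} C, 60% GC: {tm_high:.1f} C, gap: {tm_gap:.1f} C")
```

Because the gap comes only from the 0.41(%GC) term, it is independent of the salt and length terms in this approximation.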

The development of such families of tags has been attempted over the 
years with varying degrees of success. There are a number of different 
approaches for selecting sequences for use in multiplexed hybridization 
assays. The selection of sequences that can be used as zipcodes or tags in 
an addressable array has been described in the patent literature in an 
approach taken by Brenner and co-workers. United States Patent No. 5,654,413 
describes a population of oligonucleotide tags (and corresponding tag
complements) in which each oligonucleotide tag includes a plurality of
subunits, each subunit consisting of an oligonucleotide having a length of
from three to six nucleotides and each subunit being selected from a
minimally cross hybridizing set, wherein a subunit of the set would have at 
least two mismatches with any other sequence of the set. Table II of the 
Brenner patent specification describes exemplary groups of 4mer subunits that 
are minimally cross hybridizing according to the aforementioned criteria. In
the approach taken by Brenner, constructing non cross-hybridizing
oligonucleotides relies on the use of subunits that form a duplex having at
least two mismatches with the complement of any other subunit of the same
set. The ordering of subunits in the construction of oligonucleotide tags is
not specifically defined.
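
The "at least two mismatches" criterion can be sketched as a pairwise position-by-position comparison. The code below is a minimal illustration, assuming equal-length subunits compared without shifts; the function names and the four example 4mers are hypothetical and are not taken from Table II of the Brenner specification.

```python
from itertools import combinations


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length subunits differ."""
    if len(a) != len(b):
        raise ValueError("subunits must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_minimally_cross_hybridizing(subunits, min_mismatches: int = 2) -> bool:
    """True if every pair of subunits differs at >= min_mismatches positions."""
    return all(mismatches(a, b) >= min_mismatches
               for a, b in combinations(subunits, 2))


# Hypothetical 4mer subunits, invented for illustration only.
example_set = ["AAAC", "ACCA", "CACC", "CCAA"]
print(is_minimally_cross_hybridizing(example_set))
```

A subunit that differs from another at N positions forms N mismatches with that other subunit's complement, which is why comparing the subunits directly is sufficient in this simplified reading.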

Parameters used in the design of tags based on subunits are discussed
in Barany et al. (WO 9731256). For example, they describe the design of
polynucleotide sequences 24 nucleotides in length (24mers) derived from
a set of four possible tetramers, in which each 24mer "address" differs from
its nearest 24mer neighbour by 3 tetramers. They discuss further that, if
each tetramer differs from each other by at least two nucleotides, then each
24mer will differ from the next by at least six nucleotides. This is
determined without consideration for insertions or deletions when forming the 
alignment between any two sequences of the set. In this way a unique "zip 
code" sequence is generated. The zip code is ligated to a label in a target 
dependent manner, resulting in a unique "zip code" which is then allowed to 
hybridise to its address on the chip. To minimise cross-hybridisation of a 
"zip code" to other "addresses", the hybridization reaction is carried out at 
temperatures of 75-80°C. Due to the high temperature conditions for 
hybridization, 24mers that have partial homology hybridise to a lesser extent 
than sequences with perfect complementarity and represent 'dead zones'. This
approach of implementing stringent hybridization conditions, for example
involving high temperature hybridization, is also practiced by Brenner et
al.
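
The arithmetic behind the "three tetramers, hence at least six nucleotides" statement can be checked with a short sketch. The two 24mer addresses and the tetramers they are built from are hypothetical, chosen only so that differing tetramers differ at two or more bases; they are not sequences from Barany et al.

```python
def split_tetramers(address: str):
    """Split a sequence into consecutive 4-base blocks."""
    if len(address) % 4:
        raise ValueError("length must be a multiple of 4")
    return [address[i:i + 4] for i in range(0, len(address), 4)]


def tetramer_differences(a: str, b: str) -> int:
    """Count tetramer blocks that differ between two aligned addresses."""
    return sum(x != y for x, y in zip(split_tetramers(a), split_tetramers(b)))


def base_differences(a: str, b: str) -> int:
    """Count single bases that differ, with no insertions or deletions."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 24mer addresses built from made-up tetramers.
addr_a = "AAAC" + "ACCA" + "CACC" + "CCAA" + "AAAC" + "ACCA"
addr_b = "ACCA" + "ACCA" + "CCAA" + "CCAA" + "CACC" + "ACCA"

print(tetramer_differences(addr_a, addr_b), base_differences(addr_a, addr_b))
```

As the text notes, this count ignores insertions and deletions, so it can overstate how different two addresses are once shifted alignments are allowed.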

The current state of technology for designing non-cross hybridizing 
tags based on subunits does not provide sufficient guidance to construct a 
family of relatively large numbers of sequences with practical value in 
assays that require stringent non-cross hybridizing behavior.

Thus, while it is desirable to have, at once in a gene microarray 
format, a large number of "flap" molecules incorporated into the Invader 
Assay, the "flap" molecules should each be highly selective for its own 
complement sequence. While such an array provides the advantage that the 
family of molecules making up the grid is entirely of design, and does not 
rely on sequences as they occur in nature, the provision of a family of
molecules which is sufficiently large and where each individual member is
sufficiently selective for its complement over all the other zipcode
molecules (i.e., where there is sufficiently low cross-hybridization, or
cross-talk) continues to elude researchers.



SUMMARY OF THE INVENTION 

The present invention relates to the use of one set of 210 and a second 
set of 1168 minimally cross-hybridizing oligonucleotide sequences for use in 
the Invader Assay. The incorporation of these sequences into one of the two 
probes, and subsequent structure dependent cleavage of the minimally cross- 
hybridizing sequences upon hybridization to the target nucleic acid molecule 
enables the Invader Assay to be used in the analysis of multiple gene 
sequences on a gene microarray. 

Using the method of Benight et al., a family of 100 sequences was
obtained using a computer algorithm to have optimal hybridization properties 
for use in nucleic acid detection assays. The sequence set of 100 
oligonucleotides was characterized in hybridization assays, demonstrating the 
ability of family members to correctly hybridize to their complementary 
sequences with an absence of cross hybridization. These are the sequences 
having SEQ ID NOs: 1 to 100 of Table I. This set of sequences has been
expanded to include an additional 110 sequences that can be grouped with the 
original 100 sequences as having non-cross hybridizing properties, based on 
the characteristics of the original set of 100 sequences. These additional 
sequences are identified as SEQ ID NOs: 101 to 210 of the sequences in Table 
I. How these sequences were obtained is described below. 

Variant families of sequences (seen as tags or tag complements) of a 
family of sequences taken from Table I are also part of the invention. For 
the purposes of discussion, families of tag complements will be described. 

A family of complements is obtained from a set of oligonucleotides 
based on a family of oligonucleotides such as those of Table I. For 
illustrative purposes, providing a family of complements based on the 
oligonucleotides of Table I will be described. 

Firstly, the groups of sequences based on the oligonucleotides of Table I
can be represented as follows:

Table IA: Numeric sequences corresponding to word patterns of a set of
oligonucleotides

Sequence Identifier    Numeric Pattern
1                      1 4 6 6 1 3
2                      2 4 5 5 2 3
3                      1 8 1 2 3 4
4                      1 7 1 9 8 4
5                      1 1 9 2 6 9
6                      1 2 4 3 9 6
7                      9 8 9 8 10 9
8                      9 1 2 3 8 10

Here, each of the numerals 1 to 22 (numeric identifiers) represents a 4mer
and the pattern of numerals 1 to 10 of the sequences in the above list
corresponds to the pattern of tetrameric oligonucleotide segments present in
the oligonucleotides of Table I, which oligonucleotides have been found to be
non-cross-hybridizing, as described further in the detailed examples. Each
4mer is selected from the group of 4mers consisting of WWWW, WWWX, WWWY,
WWXW, WWXX, WWXY, WWYW, WWYX, WWYY, WXWW, WXWX, WXWY, WXXW, WXXX, WXXY, WXYW,
WXYX, WXYY, WYWW, WYWX, WYWY, WYXW, WYXX, WYXY, WYYW, WYYX, WYYY, XWWW, XWWX,
XWWY, XWXW, XWXX, XWXY, XWYW, XWYX, XWYY, XXWW, XXWX, XXWY, XXXW, XXXX, XXXY,
XXYW, XXYX, XXYY, XYWW, XYWX, XYWY, XYXW, XYXX, XYXY, XYYW, XYYX, XYYY, YWWW,
YWWX, YWWY, YWXW, YWXX, YWXY, YWYW, YWYX, YWYY, YXWW, YXWX, YXWY, YXXW, YXXX,
YXXY, YXYW, YXYX, YXYY, YYWW, YYWX, YYWY, YYXW, YYXX, YYXY, YYYW, YYYX, and
YYYY. Here W, X and Y represent nucleotide bases, A, G, C, etc., the
assignment of bases being made according to rules described below.
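
The group of 4mers listed above is simply every length-4 word over the three symbols W, X and Y, which gives 3^4 = 81 words. A minimal sketch of that enumeration follows; the names `SYMBOLS` and `FOURMERS` are illustrative and do not appear in the specification.

```python
from itertools import product

SYMBOLS = "WXY"

# Every length-4 word over W, X, Y, in the same order as the list in the text.
FOURMERS = ["".join(word) for word in product(SYMBOLS, repeat=4)]

print(len(FOURMERS))
print(FOURMERS[:3], FOURMERS[-1])
```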

Given this numeric pattern, a 4mer is assigned to a numeral. For
example, 1 = WXYY, 2 = YWXY, etc. Once a given 4mer has been assigned to a
given numeral, it is not assigned for use in the position of a different
numeral. It is possible, however, to assign a different 4mer to the same
numeral. That is, for example, the numeral 1 in one position could be
assigned WXYY and another numeral 1, in a different position, could be
assigned XXXW, but none of the other numerals 2 to 10 can then be assigned
WXYY or XXXW. A different way of saying this is that each of 1 to 10 is
assigned a 4mer from the list of eighty-one 4mers indicated so as to be
different from all of the others of 1 to 10.

In the case of the specific oligonucleotides given in Table I, 1 =
WXYY, 2 = YWXY, 3 = XXXW, 4 = YWYX, 5 = WYXY, 6 = YYWX, 7 = YWXX, 8 = WYXX, 9
= XYYW, 10 = XYWX, 11 = YYXW, 12 = WYYX, 13 = XYXW, 14 = WYYY, 15 = WXYW, 16
= WYXW, 17 = WXXW, 18 = WYYW, 19 = XYYX, 20 = YXYX, 21 = YXXY and 22 = XYXY.
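
The rule that a 4mer assigned to one numeral is not reused for a different numeral can be checked mechanically. The sketch below records the 22 assignments quoted above (reading the two garbled "-" signs as "=") and tests that each word has the right shape and is used by only one numeral. The dictionary and function names are illustrative assumptions.

```python
NUMERAL_TO_4MER = {
    1: "WXYY", 2: "YWXY", 3: "XXXW", 4: "YWYX", 5: "WYXY", 6: "YYWX",
    7: "YWXX", 8: "WYXX", 9: "XYYW", 10: "XYWX", 11: "YYXW", 12: "WYYX",
    13: "XYXW", 14: "WYYY", 15: "WXYW", 16: "WYXW", 17: "WXXW", 18: "WYYW",
    19: "XYYX", 20: "YXYX", 21: "YXXY", 22: "XYXY",
}


def check_assignment(mapping) -> bool:
    """True if every word is a W/X/Y 4mer and no word serves two numerals."""
    valid_shape = all(len(word) == 4 and set(word) <= set("WXY")
                      for word in mapping.values())
    distinct = len(set(mapping.values())) == len(mapping)
    return valid_shape and distinct


print(check_assignment(NUMERAL_TO_4MER))
```

This check covers only the one-word-per-numeral case used for Table I; the text also allows a single numeral to take different 4mers in different positions, which this sketch does not model.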

Once the 4mers are assigned to positions according to the above
pattern, a particular set of oligonucleotides can be created by appropriate
assignment of bases, A, T/U, G, C to W, X, Y. These assignments are made
according to one of the following two sets of rules:

(i) Each of W, X and Y is a base in which: 

(a) W = one of A, T/U, G, and C, 
X = one of A, T/U, G, and C, 
Y = one of A, T/U, G, and C, 

and each of W, X and Y is selected so as to be different 
from all of the others of W, X and Y, 

(b) an unselected said base of (i) (a) can be substituted any 
number of times for any one of W, X and Y. 
or

(ii) Each of W, X and Y is a base in which:

(a) W = G or C,
X = A or T/U,
Y = A or T/U,
and X ≠ Y, and

(b) a base not selected in (ii) (a) can be inserted into each
sequence at one or more locations, the location of each
insertion being the same in each sequence as that of every other
sequence of the set.

In the case of the specific oligonucleotides given in Table I, W = G, X
= A and Y = T.
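
Putting the pieces together, a numeric pattern is expanded in two steps: each numeral is replaced by its W/X/Y 4mer, then each symbol by its base. The sketch below applies the assignments stated above (1 = WXYY, 3 = XXXW, 4 = YWYX, 6 = YYWX; W = G, X = A, Y = T) to the first numeric pattern listed in Table IA. The function name `expand_pattern` is an illustrative assumption, and the result is only what these stated rules produce; it is not checked here against Table I.

```python
# Subset of the numeral-to-4mer assignments needed for the pattern below.
FOURMER_FOR = {1: "WXYY", 3: "XXXW", 4: "YWYX", 6: "YYWX"}

# Base assignment stated for the oligonucleotides of Table I.
BASE_FOR = {"W": "G", "X": "A", "Y": "T"}


def expand_pattern(pattern, fourmer_for=FOURMER_FOR, base_for=BASE_FOR) -> str:
    """Expand a list of numerals into a base sequence via W/X/Y words."""
    symbols = "".join(fourmer_for[numeral] for numeral in pattern)
    return "".join(base_for[symbol] for symbol in symbols)


tag = expand_pattern([1, 4, 6, 6, 1, 3])
print(tag, len(tag))
```

Six numerals of four bases each give a 24mer, and swapping `BASE_FOR` for another permitted assignment yields a variant family with the same word pattern.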

In any case, given a set of oligonucleotides generated according to one 
of these sets of rules, it is possible to modify the members of a given set 
in relatively minor ways and thereby obtain a different set of sequences 
while more or less maintaining the cross-hybridization properties of the set
subject to such modification. In particular, it is possible to insert up to
3 of A, T/U, G and C at any location of any sequence of the set of sequences.
Alternatively, or additionally, up to 3 bases can be deleted from any
sequence of the set of sequences.

A person skilled in the art would understand that given a set of
oligonucleotides having a set of properties making it suitable for use as a
family of tags (or tag complements) one can obtain another family with the
same property by reversing the order of all of the members of the set. In
other words, all the members can be taken to be read 5' to 3' or to be read
3' to 5'.

A family of complements of the present invention is based on a given
set of oligonucleotides defined as described above. Each complement of the
family is based on a different oligonucleotide of the set and each complement
contains at least 10 consecutive (i.e., contiguous) bases of the
oligonucleotide on which it is based. For a given family of complements
where one is seeking to reduce or minimize inter-sequence similarity that
would result in cross-hybridization, each and every pair of complements meets
particular homology requirements. Particularly, subject to limited
exceptions, described below, any two complements within a set of complements
are generally required to have a defined amount of dissimilarity.

In order to notionally understand these requirements for dissimilarity
as they exist for a given pair of complements of a family, a phantom sequence 
is generated from the pair of complements. A "phantom" sequence is a single 
sequence that is generated from a pair of complements by selection, from each 
complement of the pair, of a string of bases wherein the bases of the string 
occur in the same order in both complements. An object of creating such a 
phantom sequence is to create a convenient and objective means of comparing 
the sequence identity of the two parent sequences from which the phantom 
sequence is created. 

A phantom sequence may thus be generated from exemplary Sequence 1 and
Sequence 2 as follows:

Sequence 1:       ATGTTTAGTGAAAAGTTAGTATTG
                     *        :
Sequence 2:       ATGTTAGTGAATAGTATAGTATTG
                             :   ♦
Phantom Sequence: ATGTTAGTGAAAGTTAGTATTG

The phantom sequence generated from these two sequences is thus 22
bases in length. That is, one can see that there are 22 identical bases with
identical sequence (the same order) in Sequence Nos. 1 and 2. There is a
total of three insertions/deletions and mismatches present in the phantom
sequence when compared with the sequences from which it was generated:

ATGT-TAGTGAA-AGT-TAGTATTG

The dashed lines in this latter representation of the phantom sequence
indicate the locations of the insertions/deletions and mismatches in the
phantom sequence relative to the parent sequences from which it was derived.
Thus, the "T" marked with an asterisk in Sequence 1, the "A" marked with a
diamond in Sequence 2 and the "A-T" mismatch of Sequences 1 and 2 marked with
two dots were deleted in generating the phantom sequence.
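
One way to compute such a phantom sequence is as a longest common subsequence of the two parents. The sketch below is a minimal illustration under that assumption; the specification speaks of "any phantom sequence", so the longest one is used here only as the most stringent case. The function names are hypothetical. When several longest common subsequences exist, the code returns one of them.

```python
def longest_common_subsequence(a: str, b: str) -> str:
    """Return one longest string whose bases occur in the same order in a and b."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk forward through the table to recover one optimal subsequence.
    out = []
    i = j = 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


def is_subsequence(short: str, full: str) -> bool:
    """True if the bases of short occur, in order, within full."""
    remaining = iter(full)
    return all(base in remaining for base in short)


seq1 = "ATGTTTAGTGAAAAGTTAGTATTG"
seq2 = "ATGTTAGTGAATAGTATAGTATTG"
phantom = longest_common_subsequence(seq1, seq2)
print(phantom, len(phantom))
```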

A person skilled in the art will appreciate that the term
"insertion/deletion" is intended to cover the situations indicated by the
asterisk and diamond. Whether the change is considered, strictly speaking,
an insertion or deletion is merely one of vantage point. That is, one can
see that the fourth base of Sequence 1 can be deleted therefrom to obtain the
phantom sequence, or a "T" can be inserted after the third base of the
phantom sequence to obtain Sequence 1.

One can thus see that if it were possible to create a phantom sequence 
by elimination of a single insertion/deletion from one of the parent
sequences, that the two parent sequences would have identical homology over 
the length of the phantom sequence except for the presence of a single base 
in one of the two sequences being compared. Likewise, one can see that if it 
were possible to create a phantom sequence through deletion of a mismatched 
pair of bases, one base in each parent, that the two parent sequences would 
have identical homology over the length of the phantom sequence except for 
the presence of a single base in each of the sequences being compared. For 
this reason, the effect of an insertion/deletion is considered equivalent to 
the effect of a mismatched pair of bases when comparing the homology of two 
sequences. 

Once a phantom sequence is generated, the compatibility of the pair of 
complements from which it was generated within a family of complements can be 
systematically evaluated: 

According to one embodiment of the invention, a pair of complements is 
compatible for inclusion within a family of complements if any phantom 
sequence generated from the pair of complements has the following properties: 

Any consecutive sequence of bases in the phantom sequence which is
identical to a consecutive sequence of bases in each of the first and
second complements from which it is generated is no more than ((3/4 x L) - 1)
bases in length;

The phantom sequence, if greater than or equal to (5/6 x L) in length,
contains at least 3 insertions/deletions or mismatches when compared to the
first and second complements from which it is generated; and

The phantom sequence is not greater than or equal to (11/12 x L) in
length.

Here, L1 is the length of the first complement, L2 is the length of the
second complement, and L = L1, or if L1 ≠ L2, L is the greater of L1 and L2.

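
For 24mers the three length limits work out to 17, 20 and 22 bases. The sketch below computes those limits for any pair of lengths and, as one possible reading of the first requirement, measures the longest run of bases shared contiguously by both complements. The function names are assumptions, and the second requirement (counting insertions/deletions and mismatches) is not implemented because it depends on how the alignment is chosen.

```python
from fractions import Fraction


def length_thresholds(len_first: int, len_second: int):
    """Return ((3/4)L - 1, (5/6)L, (11/12)L) with L the greater length."""
    L = max(len_first, len_second)
    max_identical_run = Fraction(3, 4) * L - 1
    indel_check_from = Fraction(5, 6) * L
    too_long_from = Fraction(11, 12) * L
    return max_identical_run, indel_check_from, too_long_from


def longest_common_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for base_a in a:
        cur = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


run_limit, indel_from, reject_from = length_thresholds(24, 24)
shared_run = longest_common_run("ATGTTTAGTGAAAAGTTAGTATTG",
                                "ATGTTAGTGAATAGTATAGTATTG")
print(run_limit, indel_from, reject_from, shared_run)
```

Under this reading, a 22-base phantom sequence generated from two 24mers would already reach the (11/12 x L) limit of the third requirement.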
In particular preferred embodiments of the invention, all pairs of
complements of a given set have the properties set out above. Under
particular circumstances, it may be advantageous to have a limited number of
complements that do not meet all of these requirements when compared to every 
other complement in a family. 

In one case, for any first complement there are at most two second
complements in the family which do not meet all of the three listed
requirements. For two such complements, there would thus be a greater chance
of cross-hybridization between their tag counterparts and the first
complement. In another case, for any first complement there is at most one
second complement which does not meet all of the three listed requirements.

It is also possible, given this invention, to design a family of 
complements where a specific number or specific portion of the complements do 
not meet the three listed requirements. For example, a set could be designed 
where only one pair of complements within the set do not meet the 
requirements when compared to each other. There could be two pairs, three 
pairs, and any number of pairs up to and including all possible pairs. 
Alternatively, it may be advantageous to have a given proportion of pairs of
complements that do not meet one or more of the three requirements listed,
say 10% of pairs. This proportion could instead be 5%, 15%, 20%, 25%, 30%,
35%, or 40%.

The foregoing comparisons would generally be largely carried out using 
appropriate computer software. Although notionally described in terms of a 
phantom sequence for the sake of clarity and understanding, it will be 
understood that a competent computer programmer can carry out pairwise 
comparisons of complements in any number of ways using logical steps that 
obtain equivalent results. 
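
The pairwise screening described above can be organised as a loop over all unordered pairs with a pluggable compatibility test. The sketch below is one such arrangement; `failing_pairs`, `failing_fraction`, `toy_rule` and the six-base toy family are all hypothetical, and `toy_rule` is a deliberately simple stand-in for the three phantom-sequence requirements rather than an implementation of them.

```python
from itertools import combinations


def failing_pairs(family, is_compatible):
    """Return every unordered pair that the compatibility test rejects."""
    return [(a, b) for a, b in combinations(family, 2)
            if not is_compatible(a, b)]


def failing_fraction(family, is_compatible) -> float:
    """Proportion of all unordered pairs that fail the test."""
    total = len(family) * (len(family) - 1) // 2
    if total == 0:
        return 0.0
    return len(failing_pairs(family, is_compatible)) / total


def toy_rule(a: str, b: str) -> bool:
    """Placeholder test: require at least 3 differing positions."""
    return sum(x != y for x, y in zip(a, b)) >= 3


# Made-up six-base sequences used only to exercise the loop.
toy_family = ["AAAAAA", "AAATTT", "TTTAAA", "AAAAAT"]
bad = failing_pairs(toy_family, toy_rule)
print(bad, failing_fraction(toy_family, toy_rule))
```

Keeping the test as a parameter makes it straightforward to report how many pairs, or what proportion of pairs, fall outside the requirements for a candidate family.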

The symbols A, G, T/U, C take on their usual meaning in the art here. 
In the case of T and U, a person skilled in the art would understand that 
these are equivalent to each other with respect to the inter-strand hydrogen- 
bond (Watson-Crick) binding properties at work in the context of this 
invention. The two bases are thus interchangeable and hence the designation 
of T/U. 

Analogues of the naturally occurring bases can be inserted in their 
respective places where desired. Analogues can be defined as any non-natural 
base, such as peptide nucleic acids and the like. 

Other aspects of the invention are described below, particularly in the
numbered paragraphs at the end of this specification.

In another broad embodiment, a family of 1168 sequences was obtained
using a computer algorithm to have desirable hybridization properties for use 
in nucleic acid detection assays. The sequence set of 1168 oligonucleotides 
was partially characterized in hybridization assays, demonstrating the 
ability of family members to correctly hybridize to their complementary 
sequences with minimal cross hybridization. These are the sequences having 
SEQ ID NOs: 1 to 1168 of Table II.

Variant families of sequences (seen as tags or tag complements) of a 
family of sequences taken from Table II are also part of the invention. For 
the purposes of discussion, a family or set of oligonucleotides will often be
described as a family of tag complements, but it will be understood that such
a set could just as easily be a family of tags.

A family of complements is obtained from a set of oligonucleotides 
based on a family of oligonucleotides such as those of Table II. To simplify 
discussion, providing a family of complements based on the oligonucleotides 
of Table II will be described. 

Firstly, the groups of sequences based on the oligonucleotides of Table II
can be represented as shown in Table IIA.

Table IIA: Numeric sequences corresponding to nucleotide
patterns of a set of oligonucleotides

Numeric Pattern             Sequence Identifier

321212321112232212322313    23
311223212111321221312323    24
213123131221132322122231    25
322113222322212321213113    26
313212213211132312123121    27
323112312221321112312231    28
312231132212131112312213    29
132312111232213223112232    30
212121321112322232323223    31
221132322132212223223213    32
321321121231132313112121    33
213232121311232131222132    34
223213122131232322232111    35
213212131321313123121222    36
122323111311131311311122    37
232313112211312211311232    38
121221322113111311313223    39
223213223131112123213222    40
213132232211131323211121    41
322123123232121132112123    42
222322131123131131222123    43
132121322211131132132131    44
323131212131222131113211    45
223222121323113231121321    46
113211321321121323232211    47
122232313221231113121131    48
311132131311211131211311    49
122211312232211313213113    50
322211131223211311232321    51
222323113123113212223212    52
232322213112221321232321    53
312112312212131113232223    54
322122232113221131213213    55
132221223111313222311213    56
223232221223232132221113    57
122323131131212311132212    58
231311232111311232221223    59
123231113221231232211223    60
322213212213223221131223    61
312231212223112322232223    62
231122311132321123223212    63
312232122322313112131121    64
111222313122232312131321    65
321122131222322232232232    66
322232122322132311212132    67
123213213213123222123112    68
232221113123122311311123    69
232312112312322122232321    70
121322323131122232112213    71
121312321131311122323111    72
131221131311322112131321    73
311321112232311231113111    74
112321131113113122322321    75
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2 


3 


1 


3 


2 


3 


1 


1 


1 


3 


95 


3 


1 


2 


1 


3 


1 


2 


2 


2 


1 


3 


1 


1 


2 


3 


1 


1 


2 


2 


1 


1 


3 


2 


3 


96 


2 


2 


2 


3 


1 


1 


3 


1 


1 


3 


1 


3 


1 


2 


2 


2 


3 


1 


1 


1 


2 


2 


3 


1 


97 


1 


2 


3 


1 


1 


2 


1 


1 


3 


1 


3 


2 


2 


3 


1 


2 


1 


1 


1 


2 


3 


2 


3 


1 


98 


2 


3 


2 


2 


2 


1 


2 


3 


2 


1 


3 


2 


3 


2 


1 


3 


1 


2 


2 


3 


1 


1 


2 


2 


99 


2 


2 


2 


1 


1 


3 


2 


3 


1 


3 


2 


2 


1 


2 


1 


3 


1 


1 


3 


2 


1 


3 


2 


1 


100 


3 


1 


2 


2 


2 


1 


2 


3 


2 


3 


2 


2 


2 


3 


1 


1 


3 


2 


2 


1 


1 


3 


1 


2 


101 


2 


1 


3 


2 


2 


1 


3 


1 


3 


1 


1 


1 


3 


2 


3 


1 


2 


1 


1 


1 


3 


2 


2 


1 


102 


3 


2 


1 


1 


2 


3 


1 


2 


1 


1 


2 


3 


1 


1 


3 


2 


3 


2 


1 


2 


1 


2 


1 


3 


103 


1 


1 


2 


3 


1 


1 


3 


2 


3 


2 


2 


1 


3 


2 


1 


2 


1 


3 


1 


2 


1 


3 


2 


1 


104 


2 


1 


1 


1 


2 


2 


3 


1 


3 


2 


2 


2 


3 


2 


2 


2 


3 


1 


2 


2 


3 


2 


1 


3 


105 


2 


1 


1 


2 


3 


1 


1 


3 


1 


1 


2 


1 


1 


3 


2 


1 


2 


3 


1 


3 


2 


3 


2 


2 


106 


1 


1 


1 


2 


3 


2 


1 


1 


2 


1 


3 


2 


3 


2 


2 


3 


2 


2 


1 


3 


2 


2 


1 


3 


107 


1 


3 


1 


3 


2 


2 


1 


3 


2 


3 


1 


1 


1 


2 


3 


2 


2 


3 


2 


2 


1 


1 


1 


2 


108 


3 


1 


1 


1 


2 


1 


3 


1 


1 


1 


2 


3 


2 


1 


2 


2 


3 


2 


2 


2 


3 


2 


3 


1 


109 


1 


3 


2 


2 


1 


2 


1 


1 


3 


2 


2 


2 


3 


2 


3 


1 


3 


1 


1 


2 


2 


1 


1 


3 


110 


3 


1 


3 


2 


2 


2 


1 


2 


1 


3 2 2 1 3 1 1 2 1 2 3 2 2 3 2    111
1 3 1 3 2 2 1 2 2 1 3 1 1 3 1 1 3 1 2 2 2 1 1 3    112
3 1 3 2 2 1 1 2 3 1 1 1 2 1 1 3 2 1 2 2 2 3 2 3    113
1 2 3 1 2 3 1 1 2 1 3 2 2 3 1 1 3 2 1 2 1 2 1 3    114
1 2 1 3 1 2 1 2 3 1 3 1 2 3 1 1 1 3 2 2 1 3 2 1    115
2 1 2 3 2 1 1 1 3 1 1 1 3 2 3 1 1 1 3 1 1 3 1 1    116
2 3 1 1 2 3 2 1 3 1 1 1 2 3 1 1 2 3 2 2 3 1 1 1    117
1 1 2 2 3 1 1 2 1 3 2 3 2 3 2 3 1 3 2 2 2 1 1 2    118
1 3 1 2 1 2 2 3 2 2 2 3 1 2 2 1 1 2 3 1 1 3 1 3    119
1 1 1 3 2 2 3 2 1 1 1 3 2 2 3 1 1 3 1 2 1 1 1 3    120
3 2 2 1 1 3 1 3 1 2 2 1 2 3 1 3 1 2 3 2 1 2 2 1    121
1 3 1 1 3 1 2 1 2 1 1 3 1 1 3 1 2 2 3 1 1 2 2 3    122
3 2 1 3 1 1 1 2 2 2 3 1 1 2 2 3 1 2 3 2 3 1 1 1    123
1 1 3 1 3 2 1 3 1 2 2 3 1 2 1 1 3 2 1 2 1 2 3 1    124
2 3 1 2 1 2 1 3 2 1 3 2 3 1 1 3 1 1 1 2 1 1 3 2    125
1 3 1 2 1 1 2 3 1 2 3 1 3 1 1 1 2 3 1 1 3 1 2 1    126
1 2 3 2 3 1 1 1 3 2 1 2 2 2 3 2 3 1 2 1 2 1 3 2    127
1 1 2 1 1 3 1 3 1 1 2 2 3 1 2 1 2 3 1 1 3 1 2 3    128






Table IIA: Numeric sequences corresponding to nucleotide 
patterns of a set of oligonucleotides 

Numeric Pattern Sequence 
Identifier 



2 1 1 3 2 3 2 1 2 2 2 1 3 2 1 3 1 1 2 3 1 1 3 2    129
2 1 ? 3 2 2 1 3 1 2 2 2 3 2 2 3 1 3 1 2 2 3 1 2    130
1 1 3 2 2 2 3 2 1 2 3 1 1 3 1 3 1 2 1 3 2 1 2 2 2    131


3 1 3 1 1 1 2 3 2 2 1 2 3 2 1 2 2 2 1 3 2 1 3 2    132
2 1 2 3 2 3 1 3 1 1 2 3 2 3 2 2 2 3 1 2 2 2 1 1    133
3 2 1 2 3 2 2 2 3 2 2 2 1 2 1 3 1 1 2 3 2 1 2 3    134
3 1 3 2 1 2 1 2 1 3 1 1 3 1 1 1 3 1 1 1 2 2 2 3    135
1 2 3 1 3 2 3 1 1 3 2 1 1 1 2 3 2 1 3 2 2 1 2 2    136
2 2 1 1 3 1 1 3 2 3 1 3 2 2 1 2 2 3 2 3 1 2 1 2    137
1 2 3 1 1 1 2 3 1 3 1 1 2 1 2 2 3 2 2 3 2 2 2 3    138
3 1 2 2 1 1 2 3 1 2 2 1 2 3 2 3 1 1 2 2 3 1 2 3    139
3 1 1 1 2 3 2 2 1 1 1 3 1 2 1 2 3 1 1 1 3 2 1 3    140
2 1 2 2 3 2 2 3 1 2 2 2 3 1 2 1 2 2 1 3 2 3 2 3    141
2 2 2 1 2 3 2 2 2 3 2 3 2 1 2 3 2 1 1 3 2 1 3 2    142
1 1 2 2 3 1 1 1 3 1 1 2 2 3 2 3 2 3 1 1 2 2 3 1    143
2 3 1 3 2 2 2 3 1 1 2 2 2 3 2 2 2 3 1 3 2 1 1 2    144
3 1 2 3 2 1 2 1 1 2 3 1 2 3 2 3 2 3 2 1 1 1 2 2    145
1 2 3 2 3 1 3 1 3 1 1 3 1 1 2 2 2 3 2 2 2 1 2 2    146
3 2 3 1 2 1 1 1 3 2 1 2 2 3 2 2 3 1 2 1 3 1 1 1    147
3 1 1 3 2 1 3 1 1 2 1 3 1 1 1 3 2 2 1 1 2 1 3 1    148
2 2 3 2 3 2 1 3 2 2 1 1 3 1 3 2 2 3 2 2 2 1 1 2    149
2 1 3 2 1 3 2 1 1 3 2 2 3 2 2 1 3 1 1 2 1 3 2 2    150
1 1 2 2 2 3 1 1 3 2 1 2 1 1 2 3 1 1 2 3 2 3 2 3    151
2 1 3 1 1 1 2 2 3 2 1 3 2 1 2 2 2 3 1 3 1 3 1 1    152
2 3 2 1 2 1 2 3 2 2 1 1 2 3 1 3 1 2 3 2 2 3 2 1    153
2 1 2 2 2 3 1 2 1 1 3 1 3 1 1 2 3 1 1 3 1 1 3 2    154
2 2 3 1 1 2 1 3 2 3 2 1 1 2 3 1 1 2 1 2 3 1 2 3    155
3 2 1 3 2 2 2 3 2 3 1 1 2 1 3 1 1 2 2 1 3 2 2 2    156
1 1 1 3 1 2 3 1 2 2 3 2 1 1 2 2 2 3 2 3 2 3 1 1    157
3 1 1 3 1 2 2 3 2 2 3 1 3 2 2 1 1 2 1 3 1 2 1 1    158
1 3 1 2 2 1 2 3 2 1 3 2 3 1 2 3 2 1 1 1 2 3 2 2    159


3 1 ? 2 2 2 1 3 1 2 3 2 1 3 1 2 1 2 3 1 1 2 3 2    160


3 1 2 1 3 1 1 3 2 3 2 1 2 2 1 1 3 2 1 1 3 2 2 1    161
2 1 2 3 1 1 2 2 1 2 3 1 3 1 1 3 1 1 2 1 3 1 3 2    162
2 2 2 3 2 2 1 2 3 1 1 3 2 3 1 2 2 2 3 2 2 2 3 2    163
3 2 1 1 1 3 1 2 2 3 2 3 2 2 1 2 1 2 3 1 1 1 2 3    164
2 2 3 2 3 1 2 1 3 2 1 3 2 2 1 3 1 2 1 2 2 2 3 2    165
3 1 1 2 2 1 1 3 1 2 1 1 1 3 1 1 3 1 3 1 1 3 2 1    166
3 1 2 2 3 2 1 3 1 1 2 3 1 1 2 2 2 3 2 1 3 2 1 2    167
1 1 1 2 1 1 3 1 3 1 3 1 3 1 1 2 3 1 2 2 2 1 3 2    168
1 1 2 2 1 2 3 2 3 1 1 2 1 3 1 2 2 3 2 2 3 1 1 3    169
2 2 1 1 3 1 2 2 2 1 2 3 2 3 1 2 1 3 2 1 3 1 3 2    170
2 2 1 1 1 3 1 2 1 3 2 3 2 2 2 3 2 2 3 2 3 2 2 1    171
2 1 2 2 3 1 2 2 2 1 2 3 1 1 3 1 3 2 1 2 1 3 2 3    172
1 1 1 2 2 2 3 1 2 3 1 3 2 1 3 2 2 2 1 1 3 1 3 1    173
1 2 1 1 1 3 2 2 3 2 2 2 3 1 2 3 2 2 2 3 1 1 2 3    174
3 1 2 2 3 2 3 1 2 3 1 1 2 1 1 2 3 2 2 1 2 2 3 1    175
3 1 2 3 1 1 3 1 1 1 2 1 2 3 1 2 1 2 3 1 1 2 1 3    176
2 2 1 1 1 3 2 2 1 2 2 3 1 1 3 2 3 1 1 3 2 2 3 1    177
2 2 3 2 1 1 3 1 1 1 2 1 3 1 3 1 2 2 2 3 2 3 2 2    178
3 1 3 1 2 2 3 1 3 2 2 2 1 1 3 2 1 2 2 1 3 1 2 2    179
1 3 2 3 1 2 1 1 2 1 3 1 1 2 3 1 2 1 1 1 2 3 2 3    180
3 1 2 1 1 2 1 3 2 3 1 1 2 2 2 3 1 3 2 2 3 2 1 2    181






1 3 1 2 1 2 2 2 3 2 1 3 2 1 3 1 1 1 3 2 1 2 3 2    182
3 2 2 1 2 3 1 1 2 3 2 2 3 1 1 2 2 2 3 1 1 2 3 2    183
1 2 3 1 1 1 3 1 2 2 2 1 3 2 2 3 2 3 1 3 1 2 1 2    184
1 1 1 2 1 3 1 3 1 1 3 2 2 1 2 3 1 2 3 2 3 1 2 1    185
2 2 1 3 2 3 1 3 1 1 1 2 3 2 2 2 1 1 2 3 2 3 1 2    186
2 3 1 1 3 1 1 2 1 2 3 2 3 1 1 1 2 2 1 3 2 2 2 3    187
3 2 2 2 3 1 2 1 3 2 2 2 1 1 2 3 1 3 2 1 2 2 3 1    188
3 2 2 3 2 1 1 3 2 1 1 2 3 1 2 1 1 1 3 2 1 2 3 1    189
2 1 1 3 1 3 2 1 3 2 1 1 2 2 3 2 2 3 2 2 2 1 3 1    190
2 2 2 3 1 3 1 3 1 3 2 1 2 3 2 1 2 3 1 2 2 1 2 2    191
1 2 2 3 1 2 2 3 2 3 1 1 2 2 1 3 1 2 1 3 1 1 3 1    192
3 1 2 2 1 3 2 1 2 2 2 1 3 2 1 3 2 1 1 2 1 3 1 3    193
2 1 2 3 2 1 2 2 1 3 1 3 1 2 1 2 2 3 1 1 1 3 2 3    194
2 1 2 3 2 3 1 1 1 3 2 1 1 2 3 1 2 1 1 1 2 3 1 3    195
3 2 1 1 2 2 1 3 2 1 1 2 3 1 2 2 2 3 1 1 2 3 1 3    196
3 2 2 2 1 2 2 3 2 1 1 1 3 1 2 3 2 1 1 3 2 3 1 1    197
2 1 3 2 1 3 1 1 2 2 3 2 2 3 2 2 1 1 1 3 1 1 2 3    198
2 1 2 2 3 2 2 1 3 2 2 1 2 3 2 1 3 2 3 2 3 2 1 1    199


3 


1


2


2


3


2


3


2


3



1 1 2 1 1 3 1 2 2 3 1 1 2 2 1 3 2 3 1 3 2 1 1 3    1083
1 1 2 3 2 2 2 3 1 3 1 3 1 2 2 2 1 3 2 1 1 1 3 1    1084


1 


3 


2 


2 


2


1


2 


1 


3 


1 


1 


1 


2 


3 


2 


3 


2 


2 


2 


3 


1


2


3


3


1 


3 


1 


1 


1


2 


1 


2


2


2


3


3


1 


2 


2 


3 


1 


2 


1


2


3 


2 


2 


3


1


2


1 


1 


3


2


2 


3 


2 


1


2


2 


2 


3 


1 


3 


2 


3 


2 


2 


1 


3 


2 


2 


1 


2 


2 


1134 


2 


3 


1 


1 


2


3


1 


2 


1 


3 


2 


2


2


3


1135



2 3 1 2 1 3 2 1 2 3 2 2 2 3 2 3 1 2 2 1 1 1 3 1    1136


3 1 2 3 2 1 2 1 1 1 3 1 ? 2 1 2 3 2 2 1 2 1 1 3    1137


1 3 2 3 1 3 1 2 2 2 1 3 1 1 3 1 2 3 2 2 1 2 2 1    1138
1 2 3 1 3 1 1 2 2 2 3 2 2 1 1 1 3 1 3 1 1 1 3 2    1139
1 1 1 3 1 1 2 2 1 3 2 1 2 3 1 2 1 3 1 2 3 1 3 1    1140
2 1 3 1 3 2 2 3 2 1 2 1 3 2 2 2 1 2 1 3 2 2 3 1    1141
3 2 1 3 1 1 2 3 1 2 2 3 2 2 2 1 3 1 1 3 1 2 2 2    1142
3 2 2 2 1 2 3 2 2 2 3 1 3 1 1 3 1 3 2 2 1 2 2 2    1143
2 1 3 1 1 3 2 2 2 3 1 1 1 3 2 2 1 2 2 3 1 2 2 3    1144


3 


1 


2 


3 


1 


1 


3 


1 


3

In Table IIA, each of the numerals 1 to 3 (numeric identifiers) represents a
nucleotide base and the pattern of numerals 1 to 3 of the sequences in the
above list corresponds to the pattern of nucleotide bases present in the
oligonucleotides of Table II, which oligonucleotides have been found to be
non-cross-hybridizing, as described further in the detailed examples. Each
nucleotide base is selected from the group of nucleotide bases consisting of
A, C, G, and T/U. A particularly preferred embodiment of the invention, in
which a specific base is assigned to each numeric identifier, is shown in
Table II, below.
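
As a minimal sketch of how a numeric pattern maps to a base sequence, the
following Python snippet decodes a pattern under the preferred assignment
1 = A, 2 = T and 3 = G. The function name `decode_pattern` and the `PREFERRED`
dictionary are illustrative assumptions and are not part of the disclosure;
the example pattern is the row of Table IIA carrying identifier 112.

```python
# Preferred assignment described in the text: 1 = A, 2 = T, 3 = G.
# (Hypothetical helper names; any other one-to-one assignment can be passed in.)
PREFERRED = {"1": "A", "2": "T", "3": "G"}


def decode_pattern(pattern, assignment=PREFERRED):
    """Translate a whitespace-separated numeric pattern into a base string."""
    return "".join(assignment[token] for token in pattern.split())


# Numeric pattern of the Table IIA row with identifier 112.
row_112 = "1 3 1 3 2 2 1 2 2 1 3 1 1 3 1 1 3 1 2 2 2 1 1 3"
sequence = decode_pattern(row_112)
print(sequence, len(sequence), sequence.count("G"))
```

Passing a different dictionary, for example one mapping 3 to C, gives the
corresponding alternative embodiment of the same pattern.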

In one broad aspect, the invention is a composition comprising
molecules for use as tags or tag complements wherein each molecule comprises
an oligonucleotide selected from a set of oligonucleotides based on a group
of sequences as specified by numeric identifiers set out in Table IIA. In the
sequences, each of 1 to 3 is a nucleotide base selected to be different from the
others of 1 to 3 with the proviso that up to three nucleotide bases of each
sequence can be substituted with any nucleotide base provided that:
for any pair of sequences of the set:

M1 ≤ 16, M2 ≤ 13, M3 ≤ 20, M4 ≤ 16, and M5 ≤ 19, where:

M1 is the maximum number of matches for any alignment in which there are no
internal indels;

M2 is the maximum length of a block of matches for any alignment;

M3 is the maximum number of matches for any alignment having a maximum score;

M4 is the maximum sum of the lengths of the longest two blocks of matches for
any alignment of maximum score; and

M5 is the maximum sum of the lengths of all the blocks of matches having a
length of at least 3, for any alignment of maximum score; wherein:

the score of an alignment is determined according to the equation
(A x m) - (B x mm) - (C x (og + eg)) - (D x eg), wherein:
for each of (i) to (iv):

(i) m = 6, mm = 6, og = 0 and eg = 6,

(ii) m = 6, mm = 6, og = 5 and eg = 1,

(iii) m = 6, mm = 2, og = 5 and eg = 1, and

(iv) m = 6, mm = 6, og = 6 and eg = 0,

A is the total number of matched pairs of bases in the
alignment;

B is the total number of internal mismatched pairs in the
alignment;

C is the total number of internal gaps in the alignment; and
D is the total number of internal indels in the alignment minus
the total number of internal gaps in the alignment; and
wherein the maximum score is determined separately for each of (i),
(ii), (iii) and (iv).

An explanation of the meaning of the parameters set out above is given in the 
section describing detailed embodiments. 
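
The scoring equation can be illustrated with a short Python sketch. This is a
worked example under stated assumptions, not the software used in the
disclosure: the names `PARAMETER_SETS` and `alignment_score` are hypothetical,
and the counts A, B, C and D in the example are invented for illustration.

```python
# Parameter sets (i) to (iv) as given in the text:
# m = match, mm = mismatch, og = gap opening, eg = gap extension.
PARAMETER_SETS = {
    "i": {"m": 6, "mm": 6, "og": 0, "eg": 6},
    "ii": {"m": 6, "mm": 6, "og": 5, "eg": 1},
    "iii": {"m": 6, "mm": 2, "og": 5, "eg": 1},
    "iv": {"m": 6, "mm": 6, "og": 6, "eg": 0},
}


def alignment_score(A, B, C, D, m, mm, og, eg):
    """Score = (A x m) - (B x mm) - (C x (og + eg)) - (D x eg).

    A: matched pairs, B: internal mismatched pairs,
    C: internal gaps, D: internal indels minus internal gaps.
    """
    return A * m - B * mm - C * (og + eg) - D * eg


# Invented example: 18 matches, 4 internal mismatches and one internal gap
# made of two indels, so C = 1 and D = 2 - 1 = 1.
scores = {
    name: alignment_score(18, 4, 1, 1, **params)
    for name, params in PARAMETER_SETS.items()
}
print(scores)
```

Because the maximum score is determined separately for each parameter set, the
same pair of sequences may have a different best alignment under each of (i)
to (iv).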

In another broad aspect, the invention is a composition containing
molecules for use as tags or tag complements wherein each molecule comprises
an oligonucleotide selected from a set of oligonucleotides based on a group
of sequences as set out in Table IIA wherein each of 1 to 3 is a nucleotide
base selected to be different from the others of 1 to 3 with the proviso that
up to three nucleotide bases of each sequence can be substituted with any
nucleotide base provided that:

for any pair of sequences of the set:

M1 ≤ 19, M2 ≤ 17, M3 ≤ 21, M4 ≤ 18, and M5 ≤ 20, where:
M1 is the maximum number of matches for any alignment in which there are
no internal indels;

M2 is the maximum length of a block of matches for any alignment;

M3 is the maximum number of matches for any alignment having a maximum
score;

M4 is the maximum sum of the lengths of the longest two blocks of
matches for any alignment of maximum score; and

M5 is the maximum sum of the lengths of all the blocks of matches
having a length of at least 3, for any alignment of maximum score;
wherein

the score of an alignment is determined according to the equation
(A x m) - (B x mm) - (C x (og + eg)) - (D x eg), wherein:
for each of (i) to (iv):

(i) m = 6, mm = 6, og = 0 and eg = 6,

(ii) m = 6, mm = 6, og = 5 and eg = 1,

(iii) m = 6, mm = 2, og = 5 and eg = 1, and

(iv) m = 6, mm = 6, og = 6 and eg = 0,

A is the total number of matched pairs of bases in the alignment;
B is the total number of internal mismatched pairs in the
alignment;

C is the total number of internal gaps in the alignment; and
D is the total number of internal indels in the alignment minus
the total number of internal gaps in the alignment; and

wherein the maximum score is determined separately for each of (i), (ii),
(iii) and (iv).

In another broad aspect, the invention is a composition comprising
molecules for use as tags or tag complements wherein each molecule comprises
an oligonucleotide selected from a set of oligonucleotides based on a group
of sequences set out in Table IIA wherein each of 1 to 3 is a nucleotide base
selected to be different from the others of 1 to 3 with the proviso that up
to three nucleotide bases of each sequence can be substituted with any
nucleotide base provided that:

for any pair of sequences of the set:

M1 ≤ 19, M2 ≤ 17, M3 ≤ 21, M4 ≤ 18, and M5 ≤ 20, where:

M1 is the maximum number of matches for any alignment in which there are
no internal indels;

M2 is the maximum length of a block of matches for any alignment;
M3 is the maximum number of matches for any alignment having a
maximum score;

M4 is the maximum sum of the lengths of the longest two blocks of
matches for any alignment of maximum score; and

M5 is the maximum sum of the lengths of all the blocks of matches having a
length of at least 3, for any alignment of maximum score, wherein:
the score of an alignment is determined according to the equation
3A - B - 3C - D, wherein:

A is the total number of matched pairs of bases in the alignment;
B is the total number of internal mismatched pairs in the
alignment;

C is the total number of internal gaps in the alignment; and
D is the total number of internal indels in the alignment minus
the total number of internal gaps in the alignment.
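
Two of these quantities can be computed without a full gapped alignment. The
sketch below is an illustration under stated assumptions, not the procedure
used in the disclosure: it reads M1 as the best match count over all ungapped
(sliding) alignments of the two sequences, and M2 as the longest run of
consecutive matches, which is the length of the longest common substring. The
function names are hypothetical, and M3 to M5 are not covered because they
require enumerating maximum-score alignments.

```python
def m1_max_ungapped_matches(s, t):
    """Largest number of matches when t is slid along s with no indels."""
    best = 0
    # offset = i - j for aligned positions s[i] and t[j]
    for offset in range(-(len(t) - 1), len(s)):
        matches = sum(
            1
            for i in range(len(s))
            if 0 <= i - offset < len(t) and s[i] == t[i - offset]
        )
        best = max(best, matches)
    return best


def m2_longest_match_block(s, t):
    """Length of the longest common substring of s and t (dynamic programming)."""
    best = 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


print(m1_max_ungapped_matches("AGAGTT", "TTAGAG"))
print(m2_longest_match_block("AGAGTT", "TTAGAG"))
```

For a candidate set, both functions would be applied to every pair of
sequences and the results compared against the limits stated for M1 and M2.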

In preferred aspects, the invention provides a composition in which,
for the group of 24mer sequences, in which 1 = A, 2 = T and 3 = G, under a
defined set of conditions in which the maximum degree of hybridization
between a sequence and any complement of a different sequence of the group of
24mer sequences does not exceed 30% of the degree of hybridization between
said sequence and its complement, for all said oligonucleotides of the
composition, the maximum degree of hybridization between an oligonucleotide
and a complement of any other oligonucleotide of the composition does not
exceed 50% of the degree of hybridization of the oligonucleotide and its
complement.

More preferably, the maximum degree of hybridization between a sequence
and any complement of a different sequence does not exceed 30% of the degree
of hybridization between said sequence and its complement, and the degree of
hybridization between each sequence and its complement varies by a factor of
between 1 and up to 10, more preferably between 1 and up to 9, more
preferably between 1 and up to 8, more preferably between 1 and up to 7, more
preferably between 1 and up to 6, and more preferably between 1 and up to 5.

It is also preferred that the maximum degree of hybridization between a
sequence and any complement of a different sequence does not exceed 25%, more
preferably does not exceed 20%, more preferably does not exceed 15%, more
preferably does not exceed 10%, more preferably does not exceed 5%.

Even more preferably, the above-referenced defined set of conditions
results in a level of hybridization that is the same as the level of
hybridization obtained when hybridization conditions include 0.2 M NaCl, 0.1
M Tris, 0.08% Triton X-100, pH 8.0 at 37°C.
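
The two hybridization criteria above can be expressed as simple ratios over a
matrix of measured signals. The following Python sketch assumes a hypothetical
square matrix `signal` in which `signal[i][j]` is the degree of hybridization
of sequence i to the complement of sequence j; the numbers are invented for
illustration and are not data from the disclosure.

```python
def cross_hybridization_summary(signal):
    """Return (worst cross-hybridization ratio, spread of perfect-match signals).

    The first value is the largest off-diagonal signal expressed as a fraction
    of the same sequence's signal with its own complement. The second value is
    the factor by which the perfect-match signals vary across the set.
    """
    n = len(signal)
    worst_cross = max(
        signal[i][j] / signal[i][i]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    perfect = [signal[i][i] for i in range(n)]
    return worst_cross, max(perfect) / min(perfect)


# Invented three-sequence example (arbitrary signal units).
signal = [
    [100.0, 5.0, 12.0],
    [8.0, 60.0, 3.0],
    [2.0, 9.0, 20.0],
]
worst_cross, spread = cross_hybridization_summary(signal)
print(f"worst cross-hybridization: {worst_cross:.0%}, spread: {spread:.1f}-fold")
```

In this invented example the worst cross-hybridization is 45% and the
perfect-match signals vary 5-fold, so the set would meet a 50% limit but not
a 30% limit.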

In the composition, the defined set of conditions can include the group
of 24mer sequences being covalently linked to beads.

In a particular preferred aspect, for the group of 24mers the maximum
degree of hybridization between a sequence and any complement of a different
sequence does not exceed 15% of the degree of hybridization between said
sequence and its complement and the degree of hybridization between each
sequence and its complement varies by a factor of between 1 and up to 9, and
for all oligonucleotides of the set, the maximum degree of hybridization
between an oligonucleotide and a complement of any other oligonucleotide of
the set does not exceed 20% of the degree of hybridization of the
oligonucleotide and its complement.

It is possible that each 1 is one of A, T/U, G and C; each 2 is one of
A, T/U, G and C; and each 3 is one of A, T/U, G and C; and each of 1, 2 and 3
is selected so as to be different from all of the others of 1, 2 and 3. More
preferably, 1 is A or T/U, 2 is A or T/U and 3 is G or C. Even more
preferably, 1 is A, 2 is T/U, and 3 is G.

In certain preferred compositions, each of the oligonucleotides is from
twenty-two to twenty-six bases in length, or from twenty-three to
twenty-five, and preferably, each oligonucleotide is of the same length as
every other said oligonucleotide.

In a particularly preferred embodiment, each oligonucleotide is twenty- 
four bases in length. 

It is preferred that no oligonucleotide contains more than four
contiguous bases that are identical to each other.

It is also preferred that the number of G's in each oligonucleotide
does not exceed L/4 where L is the number of bases in said sequence.

For reasons described below, the number of G's in each said
oligonucleotide is preferred not to vary from the average number of G's in
all of the oligonucleotides by more than one. Even more preferably, the
number of G's in each said oligonucleotide is the same as every other said
oligonucleotide. In the embodiment disclosed below in which oligonucleotides
were tested, the sequence of each was twenty-four bases in length and each
oligonucleotide contained 6 G's.



It is also preferred that, for each oligonucleotide, there are at most six
bases other than G between every pair of neighboring G's.

Also, it is preferred that, at the 5'-end of each oligonucleotide, at
least one of the first, second, third, fourth, fifth, sixth and seventh bases
of the sequence of the oligonucleotide is a G. Similarly, it is preferred,
at the 3'-end of each oligonucleotide, that at least one of the first, second,
third, fourth, fifth, sixth and seventh bases of the sequence of the
oligonucleotide is a G.
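
The preferred sequence-composition rules above lend themselves to a simple
automated check. The following Python sketch is one possible reading of those
rules, with hypothetical function and key names; the example sequence is the
Table IIA pattern with identifier 112 decoded with 1 = A, 2 = T and 3 = G.

```python
from itertools import groupby


def longest_run(seq):
    """Length of the longest stretch of identical contiguous bases."""
    return max(len(list(group)) for _, group in groupby(seq))


def check_tag(seq):
    """Evaluate the preferred composition rules for a non-empty sequence."""
    g_positions = [i for i, base in enumerate(seq) if base == "G"]
    # Number of non-G bases lying between each pair of neighboring G's.
    spacers = [b - a - 1 for a, b in zip(g_positions, g_positions[1:])]
    return {
        "no_run_longer_than_4": longest_run(seq) <= 4,
        "g_count_at_most_L_over_4": len(g_positions) <= len(seq) / 4,
        "at_most_6_bases_between_gs": all(s <= 6 for s in spacers),
        "g_within_first_7": "G" in seq[:7],
        "g_within_last_7": "G" in seq[-7:],
    }


report = check_tag("AGAGTTATTAGAAGAAGATTTAAG")
print(report)
```

A candidate would be retained only if every entry of the returned dictionary
is true, for example via `all(report.values())`.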

It is possible to have sequence compositions that include one hundred
and sixty said molecules, or that include one hundred and seventy said
molecules, or that include one hundred and eighty said molecules, or that
include one hundred and ninety said molecules, or that include two hundred
said molecules, or that include two hundred and twenty said molecules, or
that include two hundred and forty said molecules, or that include two
hundred and sixty said molecules, or that include two hundred and eighty said
molecules, or that include three hundred said molecules, or that include four
hundred said molecules, or that include five hundred said molecules, or that
include six hundred said molecules, or that include seven hundred said
molecules, or that include eight hundred said molecules, or that include nine
hundred said molecules, or that include one thousand said molecules.

It is possible, in certain applications, for each molecule to be linked
to a solid phase support so as to be distinguishable from a mixture
containing other of the molecules by hybridization to its complement. Such a
molecule can be linked to a defined location on a solid phase support such
that the defined location for each molecule is different than the defined
location for different others of the molecules.

In certain embodiments, each solid phase support is a microparticle and
each said molecule is covalently linked to a different microparticle than
each other different said molecule.

In another broad aspect, the invention is a composition comprising a
set of 150 molecules for use as tags or tag complements wherein each molecule
includes an oligonucleotide having a sequence of at least sixteen nucleotide
bases wherein for any pair of sequences of the set:

M1 ≤ 19/24 x L1, M2 ≤ 17/24 x L1, M3 ≤ 21/24 x L1, M4 ≤ 18/24 x L1,
M5 ≤ 20/24 x L1, where L1 is the length of the shortest sequence of the
pair, where:

M1 is the maximum number of matches for any alignment of the pair of
sequences in which there are no internal indels;
M2 is the maximum length of a block of matches for any alignment of the
pair of sequences;

M3 is the maximum number of matches for any alignment of the pair of
sequences having a maximum score;

M4 is the maximum sum of the lengths of the longest two blocks of
matches for any alignment of the pair of sequences of maximum score; and
M5 is the maximum sum of the lengths of all the blocks of matches having a
length of at least 3, for any alignment of the pair of sequences of
maximum score, wherein:

the score of an alignment is determined according to the equation
(A x m) - (B x mm) - (C x (og + eg)) - (D x eg), wherein:
for each of (i) to (iv):

(i) m = 6, mm = 6, og = 0 and eg = 6,

(ii) m = 6, mm = 6, og = 5 and eg = 1,

(iii) m = 6, mm = 2, og = 5 and eg = 1, and

(iv) m = 6, mm = 6, og = 6 and eg = 0,

A is the total number of matched pairs of bases in the alignment;
B is the total number of internal mismatched pairs in the
alignment;

C is the total number of internal gaps in the alignment; and
D is the total number of internal indels in the alignment minus
the total number of internal gaps in the alignment; and
wherein the maximum score is determined separately for each of (i), (ii),
(iii) and (iv).
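
When the sequences are not all 24 bases long, the limits scale with the length
L1 of the shorter sequence of each pair. The sketch below shows that
arithmetic with exact fractions; the names `BASE_LIMITS`, `scaled_limits` and
`within_limits` are hypothetical, and reading each limit as "less than or
equal to" is an assumption made for this example.

```python
from fractions import Fraction

# Limits for 24-base sequences, taken from the fractions 19/24 ... 20/24.
BASE_LIMITS = {"M1": 19, "M2": 17, "M3": 21, "M4": 18, "M5": 20}


def scaled_limits(len_a, len_b):
    """Scale each limit by L1/24, where L1 is the shorter sequence length."""
    shortest = min(len_a, len_b)
    return {name: Fraction(limit, 24) * shortest for name, limit in BASE_LIMITS.items()}


def within_limits(measured, len_a, len_b):
    """True if every measured M value is at or below its scaled limit."""
    limits = scaled_limits(len_a, len_b)
    return all(measured[name] <= limits[name] for name in BASE_LIMITS)


limits_16 = scaled_limits(16, 24)
print({name: float(value) for name, value in limits_16.items()})
```

For two 24-base sequences the scaled limits reduce to 19, 17, 21, 18 and 20;
for a 16-base sequence paired with a 24-base sequence the M1 limit becomes
38/3, so an M1 of 12 would pass and an M1 of 13 would not.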

In yet another broad aspect, the invention is a composition that
includes a set of 150 molecules for use as tags or tag complements wherein
each molecule has an oligonucleotide having a sequence of at least sixteen
nucleotide bases wherein for any pair of sequences of the set:
M1 ≤ 19, M2 ≤ 17, M3 ≤ 21, M4 ≤ 18, and M5 ≤ 20, where:
M1 is the maximum number of matches for any alignment of the pair of
sequences in which there are no internal indels;

M2 is the maximum length of a block of matches for any alignment of
the pair of sequences;

M3 is the maximum number of matches for any alignment of the pair of
sequences having a maximum score;

M4 is the maximum sum of the lengths of the longest two blocks of matches
for any alignment of the pair of sequences of maximum score; and
M5 is the maximum sum of the lengths of all the blocks of matches having a
length of at least 3, for any alignment of the pair of sequences of
maximum score, wherein:

the score of an alignment is determined according to the equation
3A - B - 3C - D, wherein:

A is the total number of matched pairs of bases in the alignment;
B is the total number of internal mismatched pairs in the
alignment;

C is the total number of internal gaps in the alignment; and
D is the total number of internal indels in the alignment minus
the total number of internal gaps in the alignment.

In certain embodiments of the invention, each sequence of a composition
has up to fifty bases. More preferably, however, each sequence is between
sixteen and forty bases in length, or between sixteen and thirty-five bases
in length, or between eighteen and thirty bases in length, or between twenty
and twenty-eight bases in length, or between twenty-one and twenty-seven
bases in length, or between twenty-two and twenty-six bases in length.

Often, each sequence is of the same length as every other said
sequence. In particular embodiments disclosed herein, each sequence is
twenty-four bases in length.

Again, it can be preferred that no sequence contains more than four
contiguous bases that are identical to each other, etc., as described above.

In certain preferred embodiments, the composition is such that, under a 
defined set of conditions, the maximum degree of hybridization between an 
oligonucleotide and any complement of a different oligonucleotide of the 
composition does not exceed about 30% of the degree of hybridization between 
said oligonucleotide and its complement, more preferably 20%, more preferably 
15%, more preferably 10%, more preferably 6%. 

Preferably, the set of conditions results in a level of hybridization 
that is the same as the level of hybridization obtained when hybridization 
conditions include 0.2 M NaCl, 0.1 M Tris, 0.08% Triton X-100, pH 8.0 at 
37°C, and the oligonucleotides are covalently linked to microparticles. Of 
course, these specific conditions can themselves be used for determining 
the level of hybridization. 

It is also preferred that under such a defined set of conditions, the 
degree of hybridization between each oligonucleotide and its complement 
varies by a factor of between 1 and up to 8, more preferably up to 7, more 
preferably up to 6, more preferably up to 5. In a particular disclosed 
embodiment, the observed variance in the degree of hybridization was a factor 
of only 5.3, i.e., the degree of hybridization between each oligonucleotide 
and its complement varied by a factor of between 1 and 5.6. 
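
The two figures of merit described above (the worst cross-hybridization relative to the perfect match, and the spread among perfect-match signals) can be sketched as follows. This is only an illustration: the square matrix layout, in which `signal[i][j]` is the signal of oligonucleotide i with the complement of oligonucleotide j, and the function name `hybridization_summary` are assumptions, and no measured data from the examples is reproduced here.

```python
def hybridization_summary(signal):
    """Return (worst cross-hybridization fraction, perfect-match spread).

    signal[i][j] is assumed to hold the hybridization signal between
    oligonucleotide i and the complement of oligonucleotide j, so the
    diagonal holds the perfect-match signals.
    """
    n = len(signal)
    perfect = [signal[i][i] for i in range(n)]
    # Largest off-diagonal signal, expressed relative to the perfect
    # match of the same oligonucleotide.
    worst_cross = max(
        signal[i][j] / signal[i][i]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    # Factor separating the strongest and weakest perfect matches.
    spread = max(perfect) / min(perfect)
    return worst_cross, spread
```

A composition would satisfy the 6% embodiment if the first value is at most 0.06, and the "factor of up to 5" embodiment if the second value is at most 5.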

In certain preferred embodiments, under the defined set of conditions, 
the maximum degree of hybridization between a said oligonucleotide and any 
complement of a different oligonucleotide of the composition does not exceed 
about 15%, more preferably 10%, more preferably 6%. 

In one preferred embodiment, the set of conditions results in a level 
of hybridization that is the same as the level of hybridization obtained when 
hybridization conditions include 0.2 M NaCl, 0.1 M Tris, 0.08% Triton X-100, 
pH 8.0 at 37°C, and the oligonucleotides are covalently linked to 
microparticles. 

Also, under the defined set of conditions, it is preferred that the 
degree of hybridization between each oligonucleotide and its complement 
varies by a factor of between 1 and up to 8, more preferably up to 7, more 
preferably up to 6, more preferably up to 5. 

Any composition of the invention can include one hundred and sixty of 
the oligonucleotide molecules, or one hundred and seventy of the 
oligonucleotide molecules, or one hundred and eighty of the oligonucleotide 
molecules, or one hundred and ninety of the oligonucleotide molecules, or two 
hundred of the oligonucleotide molecules, or two hundred and twenty of the 
oligonucleotide molecules, or two hundred and forty of the oligonucleotide 
molecules, or two hundred and sixty of the oligonucleotide molecules, or two 
hundred and eighty of the oligonucleotide molecules, or three hundred of the 
oligonucleotide molecules, or four hundred of the oligonucleotide molecules, 
or five hundred of the oligonucleotide molecules, or six hundred of the 
oligonucleotide molecules, or seven hundred of the oligonucleotide molecules, 
or eight hundred of the oligonucleotide molecules, or nine hundred of the 
oligonucleotide molecules, or one thousand or more of the oligonucleotide 
molecules. 

A composition of the invention can be a family of tags, or it can be a 
family of tag complements. 

An oligonucleotide molecule belonging to a family of molecules of the 
invention can have incorporated thereinto one or more analogues of nucleotide 
bases, preference being given to those that undergo normal Watson-Crick base 
pairing. 

The invention includes kits for sorting and identifying 
polynucleotides. Such a kit can include one or more solid phase supports 
each having one or more spatially discrete regions, each such region having a 
uniform population of substantially identical tag complements covalently 
attached. The tag complements are made up of a set of oligonucleotides of 
the invention. 

The one or more solid phase supports can be a planar substrate in which 
the one or more spatially discrete regions is a plurality of spatially 
addressable regions. 

The tag complements can also be coupled to microparticles. 
Microparticles preferably each have a diameter in the range of from 5 to 40 
µm. 

Such a kit preferably includes microparticles that are 
spectrophotometrically unique, and therefore distinguishable from each other 
according to conventional laboratory techniques. Of course for such kits to 
work, each type of microparticle would generally have only one tag complement 
associated with it, and usually there would be a different oligonucleotide 
tag complement associated with (attached to) each type of microparticle. 

The invention includes methods of using families of oligonucleotides of 
the invention. 

One such method is of analyzing a biological sample containing a 
biological sequence for the presence of a mutation or polymorphism at a locus 
of the nucleic acid. The method includes: 

(A) amplifying the nucleic acid molecule in the presence of a first primer 
having a 5'-sequence having the sequence of a tag complementary to the 
sequence of a tag complement belonging to a family of tag complements 
of the invention to form an amplified molecule with a 5'-end with a 
sequence complementary to the sequence of the tag; 

(B) extending the amplified molecule in the presence of a polymerase and a 
second primer having a 5'-end complementary to the 3'-end of the amplified 
sequence, with the 3'-end of the second primer extending to immediately 
adjacent said locus, in the presence of a plurality of nucleoside 
triphosphate derivatives each of which is: (i) capable of incorporation 
during transcription by the polymerase onto the 3'-end of a growing 
nucleotide strand; (ii) causes termination of polymerization; and (iii) 
capable of differential detection, one from the other, wherein there 
is a said derivative complementary to each possible nucleotide present 
at said locus of the amplified sequence; 

(C) specifically hybridizing the second primer to a tag complement having 
the tag complement sequence of (A); and 

(D) detecting the nucleotide derivative incorporated into the second 
primer in (B) so as to identify the base located at the locus of the 
nucleic acid. 

In another method of the invention, a biological sample containing a 
plurality of nucleic acid molecules is analyzed for the presence of a 
mutation or polymorphism at a locus of each nucleic acid molecule, for each 
nucleic acid molecule. This method includes steps of: 

(A) amplifying the nucleic acid molecule in the presence of a first primer 
having a 5'-sequence having the sequence of a tag complementary to the 
sequence of a tag complement belonging to a family of tag complements 
of the invention to form an amplified molecule with a 5'-end with a 
sequence complementary to the sequence of the tag; 

(B) extending the amplified molecule in the presence of a polymerase and a 
second primer having a 5'-end complementary to the 3'-end of the amplified 
sequence, the 3'-end of the second primer extending to immediately 
adjacent said locus, in the presence of a plurality of nucleoside 
triphosphate derivatives each of which is: (i) capable of 
incorporation during transcription by the polymerase onto the 3'-end 
of a growing nucleotide strand; (ii) causes termination of 
polymerization; and (iii) capable of differential detection, one from 
the other, wherein there is a said derivative complementary to each 
possible nucleotide present at said locus of the amplified molecule; 

(C) specifically hybridizing the second primer to a tag complement having 
the tag complement sequence of (A); and 

(D) detecting the nucleotide derivative incorporated into the second 
primer in (B) so as to identify the base located at the locus of the 
nucleic acid; 

wherein each tag of (A) is unique for each nucleic acid molecule and steps (A) 
and (B) are carried out with said nucleic molecules in the presence of each 
other. 

Another method includes analyzing a biological sample that contains a 
plurality of double stranded complementary nucleic acid molecules for the 
presence of a mutation or polymorphism at a locus of each nucleic acid 
molecule, for each nucleic acid molecule. The method includes steps of: 

(A) amplifying the double stranded molecule in the presence of a pair of 
first primers, each primer having an identical 5'-sequence having the 
sequence of a tag complementary to the sequence of a tag complement 
belonging to a family of tag complements of the invention to form 
amplified molecules with 5'-ends with a sequence complementary to the 
sequence of the tag; 

(B) extending the amplified molecules in the presence of a polymerase and a 
pair of second primers, each second primer having a 5'-end complementary to 
a 3'-end of the amplified sequence, the 3'-end of each said second primer 
extending to immediately adjacent said locus, in the presence of a 
plurality of nucleoside triphosphate derivatives each of which is: (i) 
capable of incorporation during transcription by the polymerase onto 
the 3'-end of a growing nucleotide strand; (ii) causes termination of 
polymerization; and (iii) capable of differential detection, one from 
the other; 

(C) specifically hybridizing each of the second primers to a tag complement 
having the tag complement sequence of (A); and 

(D) detecting the nucleotide derivative incorporated into the second 
primers in (B) so as to identify the base located at said locus; 

wherein the sequence of each tag of (A) is unique for each nucleic acid 
molecule and steps (A) and (B) are carried out with said nucleic molecules 
in the presence of each other. 

In yet another aspect, the invention is a method of analyzing a 
biological sample containing a plurality of nucleic acid molecules for the 
presence of a mutation or polymorphism at a locus of each nucleic acid 
molecule, for each nucleic acid molecule, the method including steps of: 

(a) hybridizing the molecule and a primer, the primer having a 5'-sequence 
having the sequence of a tag complementary to the sequence of a tag 
complement belonging to a family of tag complements of the invention 
and a 3'-end extending to immediately adjacent the locus; 

(b) enzymatically extending the 3'-end of the primer in the presence of a 
plurality of nucleoside triphosphate derivatives each of which is: (i) 
capable of enzymatic incorporation onto the 3'-end of a growing 
nucleotide strand; (ii) causes termination of said extension; and 
(iii) capable of differential detection, one from the other, wherein 
there is a said derivative complementary to each possible nucleotide 
present at said locus; 

(c) specifically hybridizing the extended primer formed in step (b) to a 
tag complement having the tag complement sequence of (a); and 

(d) detecting the nucleotide derivative incorporated into the primer in 
step (b) so as to identify the base located at the locus of the nucleic 
molecule; 

wherein each tag of (a) is unique for each nucleic acid molecule and steps (a) 
and (b) are carried out with said nucleic molecules in the presence of each 
other. 



The derivative can be a dideoxy nucleoside triphosphate. 

Each respective complement can be attached as a uniform population of 
substantially identical complements in spatially discrete regions on one or 
more solid phase support(s). 

Each tag complement can include a label, each such label being 
different for respective complements, and step (d) can include detecting the 
presence of the different labels for respective hybridization complexes of 
bound tags and tag complements. 

Another aspect of the invention includes a method of determining the 
presence of a target suspected of being contained in a mixture. The method 
includes the steps of: 

(i) labelling the target with a first label; 

(ii) providing a first detection moiety capable of specific binding to the 
target and including a first tag; 

(iii) exposing a sample of the mixture to the detection moiety under 
conditions suitable to permit (or cause) said specific binding of the 
molecule and target; 

(iv) providing a family of suitable tag complements of the invention wherein 
the family contains a first tag complement having a sequence 
complementary to that of the first tag; 

(v) exposing the sample to the family of tag complements under conditions 
suitable to permit (or cause) specific hybridization of the first tag 
and its tag complement; 

(vi) determining whether a said first detection moiety hybridized to a first 
tag complement is bound to a said labelled target in order to determine the 
presence or absence of said target in the mixture. 

Preferably, the first tag complement is linked to a solid support at a 
specific location of the support and step (vi) includes detecting the 
presence of the first label at said specified location. 

Also, the first tag complement can include a second label and step (vi) 
includes detecting the presence of the first and second labels in a 
hybridized complex of the moiety and the first tag complement. 

Further, the target can be selected from the group consisting of 
organic molecules, antigens, proteins, polypeptides, antibodies and nucleic 
acids. The target can be an antigen and the first molecule can be an 
antibody specific for that antigen. 

The antigen is usually a polypeptide or protein and the labelling step 
can include conjugation of fluorescent molecules, digoxigenin, biotinylation 
and the like. 

The target can be a nucleic acid and the labelling step can include 
incorporation of fluorescent molecules, radiolabelled nucleotide, 
digoxigenin, biotinylation and the like. 

Another aspect of the invention includes detecting the presence of a 
target nucleic acid molecule using the Invader Assay, which is described in 
detail in US Patent No. 5,985,557 issued November 16, 1999, incorporated 
herein by reference. The sequences of the present invention are incorporated 
into the 3' portion of one of the two oligonucleotide probes that will 
eventually be cleaved by a Cleavase enzyme and captured by its complement 
which may be attached on a solid phase support in a microarray format. 

Another aspect of the invention includes a method of analyzing a 
biological sample comprising a plurality of target nucleic acid molecules for 
the presence of a mutation or polymorphism at a locus of each target nucleic 
acid molecule using the Invader Assay. Again, the sequences of the present 
invention are incorporated into the 3' portion of one of the two probes that 
will eventually be cleaved by a Cleavase enzyme and detected by using the 
cleaved sequence's complement, which may be attached on a solid phase support 
such as in a microarray format. 

Another aspect of the invention incorporates the use of a second target 
nucleic acid sequence, wherein the second target nucleic acid sequence 
comprises a synthetic nucleic acid. The synthetic nucleic acid may further 
comprise at least one hairpin loop. The construction and use of such nucleic 
acid sequences with hairpin loops has been described in detail in US Patent 
No. 5,770,365 issued June 23, 1998 and International Publication WO 
01/94625A2 published December 13, 2001. 

The present invention capitalizes on the exquisite specificity of the 
Invader Assay and the minimally cross-hybridizing sequences of the present 
invention such that simultaneous use of multiple hybridization probes in a 
single experiment is now possible. The methods and compositions of the 
present invention allow for accurate and homogenous genotyping of a plurality 
of distinct nucleic acids in a single experiment. The methods and 
compositions of the present invention are flexible enough to extend to novel 
loci with little optimization; the features of both the Invader Assay and 
the sequences of the present invention lend the technology to automation. 

DETAILED DESCRIPTION OF THE INVENTION 
FIGURES 

Reference is made to the attached figures in which: 

Figures 1A and 1B illustrate results obtained in the cross-hybridization 
experiments described in Example 1. Figure 1A shows the 
hybridization pattern found when a microarray containing all 100 probes (SEQ 
ID NOs: 1 to 100 of Table I) was hybridized with a 24mer oligonucleotide 
having the complementary sequence to SEQ ID NO: 3 of Table I (target). Figure 
1B shows the pattern observed when a similar array was hybridized with a mix 
of all 100 targets, i.e., oligonucleotides having the sequences complementary 
to SEQ ID NOs: 1 to 100 of Table I. 

Figure 2 shows the intensity of the signal (MFI) for each perfectly 
matched sequence (indicated in Table I) and its complement obtained as 
described in Example 3. 

Figure 3 is a three dimensional representation showing cross- 
hybridization observed for the sequences of Figure 2 as described in 
Example 3. The results shown in Figure 2 are reproduced along the 
diagonal of the drawing. 

Figure 4 is illustrative of results obtained for an individual target 
(SEQ ID NO: 23 of Table I, target No. 16) when exposed to the 100 probes of 
Example 3. The MFI for each bead is plotted. 

Figure 5 illustrates generally the steps followed to obtain a family of 
sequences of the present invention; 

Figure 6 shows the intensity of the signal (MFI) for each perfectly 
matched sequence (probe sequence indicated in Table II) and its complement 
(target at 50 fmol) obtained as described in Example 4; 

Figure 7 is a three dimensional representation showing cross- 
hybridization observed for the sequences of Figure 6 as described in Example 
4. The results shown in Figure 6 are reproduced along the diagonal of the 
drawing; 

Figure 8 is illustrative of the results obtained for an individual 
target (Table II, SEQ ID No: 90, target No. 90) when exposed to the 100 
probes of Example 4. The MFI for each bead is plotted. 

DETAILED DESCRIPTION OF THE INVENTION 

The invention provides a method for sorting complex 
mixtures of molecules by the use of families of oligonucleotide 
sequence tags. The families of oligonucleotide sequence tags are 
designed so as to provide minimal cross hybridization during the 
sorting process. Thus any sequence within a family of sequences will 
not cross hybridize with any other sequence derived from that family 
under appropriate hybridization conditions known by those skilled in 
the art. The invention is particularly useful in highly parallel 
processing of analytes. 

Families of Oligonucleotide Sequence Tags 

The present invention includes a family of 24mer polynucleotides that 
have been demonstrated to be minimally cross-hybridizing with each other. 
This family of polynucleotides is thus useful as a family of tags, and their 
complements as tag complements. 

The oligonucleotide sequences that belong to families of 
sequences that do not exhibit cross hybridization behavior can be 
derived by computer programs (described in United States Provisional 
Patent Application No. 60/181,563 filed February 10, 2000). The 
programs use a method of generating a maximum number of minimally 
cross-hybridizing polynucleotide sequences that can be summarized as 
follows. First, a set of sequences of a given length is created based 
on a given number of block elements. Thus, if a family of 
polynucleotide sequences 24 nucleotides (24mer) in length is desired 
from a set of 6 block elements, each element comprising 4 nucleotides, 
then a family of 24mers is generated considering all positions of the 6 
block elements. In this case, there will be 6^6 (46,656) ways of 
assembling the 6 block elements to generate all possible polynucleotide 
sequences 24 nucleotides in length. 
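
The enumeration described above can be sketched in a few lines. The six tetramers in `BLOCKS` are hypothetical placeholders chosen only to have one G and three A's or T's each; they are not asserted to be the block elements actually used to build Table I, and the function name `assemble` is likewise an assumption.

```python
from itertools import product

# Hypothetical block elements (one G plus three A/T bases each);
# the blocks actually used for Table I are not reproduced here.
BLOCKS = ["GATT", "TGAT", "ATTG", "TTAG", "GTAA", "AGTA"]


def assemble(blocks, n_positions):
    """Yield every sequence formed by placing one block at each position."""
    for combo in product(blocks, repeat=n_positions):
        yield "".join(combo)


# Six blocks placed at six positions give 6**6 = 46,656 candidate 24mers.
candidates = list(assemble(BLOCKS, 6))
print(len(candidates), len(candidates[0]))
```

The rules on block identities described in the text would then be applied to this candidate pool as filters.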

Constraints are imposed on the sequences and are expressed 
as a set of rules on the identities of the blocks such that homology 
between any two sequences will not exceed the degree of homology 
desired between these two sequences. All polynucleotide sequences 
generated which obey the rules are saved. Sequence comparisons are 
performed in order to generate an incidence matrix. The incidence 
matrix is presented as a simple graph and the sequences with the 
desired property of being minimally cross hybridizing are found from a 
clique of the simple graph, which may have multiple cliques. Once a 
clique containing a suitably large number of sequences is found, the 
sequences are experimentally tested to determine if it is a set of 
minimally cross hybridizing sequences. This method has been used to 
obtain the 100 non cross-hybridizing tags of Table I that are the subject 
of Example 1. 
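
The incidence-matrix and clique step can be pictured with the following minimal sketch. It is not the program of the referenced provisional application: the greedy heuristic shown finds a clique but not necessarily the largest one, and the compatibility test (`toy_compatible`, counting matches at identical positions only) is a deliberately simplified stand-in for the homology rules. All names are assumptions made for illustration.

```python
from itertools import combinations


def incidence(seqs, compatible):
    """Build the graph: an edge joins two sequences that are compatible."""
    adj = {s: set() for s in seqs}
    for a, b in combinations(seqs, 2):
        if compatible(a, b):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def greedy_clique(adj):
    """Grow a clique greedily, visiting the best-connected sequences first."""
    clique = []
    for s in sorted(adj, key=lambda s: len(adj[s]), reverse=True):
        if all(s in adj[member] for member in clique):
            clique.append(s)
    return clique


def toy_compatible(a, b, limit=1):
    """Toy rule: at most `limit` identical bases at the same positions."""
    return sum(x == y for x, y in zip(a, b)) <= limit


seqs = ["GATT", "GATA", "TTAG", "ATGT"]
family = greedy_clique(incidence(seqs, toy_compatible))
print(family)
```

Every pair of sequences in the returned clique satisfies the compatibility rule, which is the property required of a candidate family before experimental testing.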

The method includes a rational approach to the selection of groups 
of sequences that are used to describe the blocks. For example, there are n^4 
different tetramers that can be obtained from n different nucleotides, 
non-standard bases or analogues thereof. In a more preferred embodiment 
there are 4^4 or 256 possible tetramers when natural nucleotides are used. 
More preferably, there are 81 possible tetramers when only 3 bases are used: 
A, T and G. Most preferably, there are 32 different tetramers when all 
sequences have only one G. 
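
The tetramer counts quoted above can be checked by direct enumeration. The sketch below is illustrative only; the helper name `tetramers` is an assumption.

```python
from itertools import product


def tetramers(alphabet):
    """All 4-base words over the given alphabet (n**4 of them)."""
    return ["".join(p) for p in product(alphabet, repeat=4)]


natural = tetramers("ACGT")       # 4**4 words
three_base = tetramers("ATG")     # 3**4 words
one_g = [t for t in three_base if t.count("G") == 1]  # exactly one G

print(len(natural), len(three_base), len(one_g))
```

The last count follows from four possible positions for the single G and two choices (A or T) at each of the other three positions, i.e. 4 x 2^3 = 32.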

Block sequences can be composed of a subset of natural bases, most 
preferably A, T and G. Sequences derived from blocks that are deficient 
in one base possess useful characteristics, for example, in reducing 
potential secondary structure formation or reduced potential for cross 
hybridization with nucleic acids in nature. Sets of block sequences that 
are most preferable in constructing families of non cross hybridizing tag 
sequences should contribute approximately equivalent stability to the 
formation of the correct duplex as all other block sequences of the set. 
This should provide tag sequences that behave isothermally. This can be 
achieved, for example, by maintaining a constant base composition for all 
block sequences such as one G and three A's or T's for each block 
sequence. Preferably, non-cross hybridizing sets of block sequences will 
be composed of blocks of sequences that are isothermal. The block 
sequences should be different from each other by at least one mismatch. 
Guidance for selecting such sequences is provided by methods for selecting 
primer and/or probe sequences that can be found in published techniques 
(Robertson et al., Methods Mol Biol; 98:121-54 (1998); Rychlik et al., 
Nucleic Acids Research, 17:8543-8551 (1989); Breslauer et al., Proc Natl 
Acad Sci., 83:3746-3750 (1986)) and the like. Additional sets of sequences 
can be designed by extrapolating on the original family of non cross 
hybridizing sequences by simple methods known to those skilled in the art. 

A preferred family of 100 tags is shown as SEQ ID NOs: 1 to 100 in 
Table I. Characterization of the family of 100 sequence tags was 
performed to determine the ability of these sequences to form specific 
duplex structures with their complementary sequences and to assess the 
potential for cross hybridization. The 100 sequences were synthesized and 
spotted onto glass slides where they were coupled to the surface by amine 
linkage. Complementary tag sequences were Cy3-labeled and hybridized 
individually to the array containing the family of 100 sequence tags. 
Formation of duplex structures was detected and quantified for each of the 
positions on the array. Each of the tag sequences performed as expected, 
that is, the perfect match duplex was formed in the absence of significant 
cross hybridization under stringent hybridization conditions. The results 
of a sample hybridization are shown in Figure 1. Figure 1A shows the 
hybridization pattern seen when a microarray containing all 100 probes was 
hybridized with the target complementary to probe 181234. The 4 sets of 
paired spots correspond to the probe complementary to the target. Figure 1B 
shows the pattern seen when a similar array was hybridized with a mix of all 
100 targets. These results indicate that the family of sequences which is 
the subject of this patent can be used as a family of non-cross hybridizing 
(tag) sequences. 

The family of 100 non-cross-hybridizing sequences can be expanded by 
incorporating additional tetramer sequences that are used in constructing 
further 24mer oligonucleotides. In one example, four additional words were 
included in the generation of new sequences to be considered for inclusion as 
non-cross talkers in a family of sequences that were obtained from the above 
method using 10 tetramers. In this case, the four additional words were 
selected to avoid potential homologies with all potential combinations of 
other words: YYXW (TTAG); WYYX (GTTA); XYXW (ATAG) and WYYY (GTTT). The 
total number of sequences containing six words using the 14 possible words is 
14^6, or 7,529,536. These sequences were screened to eliminate sequences that 
contain repetitive regions that present potential hybridization problems such 
as four or more of a similar base (e.g., AAAA or TTTT) or pairs of G's. Each 
of these sequences was compared to the sequence set of the original family of 
100 non-cross-hybridizing sequences (SEQ ID NOs: 1 to 100). Any new sequence 
that contained a minimal threshold of homology (that does not include the use 
of insertions or deletions) such as 15 or more matches with any of the 
original family of sequences was eliminated. In other words, if it was 
possible to align a new sequence with one or more of the original 100 
sequences so as to obtain a maximum simple homology of 15/24 or more, the new 
sequence was dropped. "Simple homology" between a pair of sequences is 
defined here as the number of pairs of nucleotides that are matching (are the 
same as each other) in a comparison of two aligned sequences divided by the 
total number of potential matches. "Maximum simple homology" is obtained 
when two sequences are aligned with each other so as to have the maximum 
number of paired matching nucleotides. In any event, the set of new 
sequences so obtained was referred to as the "candidate sequences". One of 
the candidate sequences was arbitrarily chosen and referred to as sequence 
101. All the candidate sequences were checked against sequence 101, and 
sequences that contained 15 or more non-consecutive matches (i.e., a maximum 
simple homology of 15/24 (62.5%) or more) were eliminated. This results in a 
smaller set of candidate sequences from which another sequence is selected 
that is now referred to as sequence 102. The smaller set of candidate 
sequences is now compared to sequence 102, eliminating sequences that 
contained 15 or more non-consecutive matches, and the process is repeated 
until there are no candidate sequences remaining. Also, any sequence selected 
from the candidate sequences is eliminated if it has 13 or more consecutive 
matches with any other previously selected candidate sequence. 
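
The screening and selection procedure just described can be sketched as follows. This is an illustrative one-pass variant written under stated assumptions, not the program actually used: candidates are visited in list order rather than chosen arbitrarily, alignments are ungapped (no insertions or deletions), and the names `alignment_stats` and `greedy_select` are hypothetical. The default limits mirror the 15-match and 13-consecutive-match thresholds given in the text.

```python
def alignment_stats(a, b):
    """Return (max matches, longest run of consecutive matches).

    Both values are maxima over every ungapped alignment obtained by
    sliding b along a.
    """
    best_total = best_run = 0
    for shift in range(-(len(b) - 1), len(a)):
        total = run = 0
        for i, base in enumerate(b):
            j = i + shift
            if 0 <= j < len(a) and a[j] == base:
                total += 1
                run += 1
                best_run = max(best_run, run)
            else:
                run = 0
        best_total = max(best_total, total)
    return best_total, best_run


def greedy_select(candidates, existing, match_limit=15, run_limit=13):
    """Keep each candidate that clashes with neither family nor earlier picks."""
    selected = []
    for cand in candidates:
        # Drop candidates too similar to the existing family.
        if any(alignment_stats(cand, old)[0] >= match_limit for old in existing):
            continue
        # Drop candidates too similar to a previously selected candidate.
        clash = False
        for prev in selected:
            total, run = alignment_stats(cand, prev)
            if total >= match_limit or run >= run_limit:
                clash = True
                break
        if not clash:
            selected.append(cand)
    return selected
```

For short toy sequences the limits can be scaled down, e.g. `greedy_select(pool, family, match_limit=6, run_limit=4)` for 10mers.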

The additional set of 73 tag sequences so obtained (SEQ ID NOs: 101 to 173 
of Table I) is composed of sequences that when compared to any of SEQ ID NOs: 1 
to 100 of Table I have no greater similarity than the sequences of the original 
100 sequence tags of Table I. The sequence set as derived from the original 
family of non cross hybridizing sequences, SEQ ID NOs: 1 to 173 of Table I, is 
expected to behave with similar hybridization properties to the sequences having 
SEQ ID NOs: 1 to 100 since it is understood that sequence similarity correlates 
directly with cross hybridization (Southern et al., Nat. Genet.; 21, 5-9: 1999). 

The set of 173 24mer oligonucleotides was expanded to include those 
having SEQ ID NOs: 174 to 210 as follows. The 4mers WXYW, XYXW, WXXW, WYYW, 
XYYX, YXYX, YXXY and XYXY where W=G, X=A, and Y=U/T were used in combination 
with the fourteen 4mers used in the generation of SEQ ID NOs: 1 to 173 to 
generate potential 24-base oligonucleotides. Excluded from the set were 
those containing the sequence patterns GG, AAAA and TTTT. To be included in 
the set of additional 24mers, a sequence also had to have at least one of 
the 4mers containing two G's: WXYW (GATG), WYXW (GTAG), WXXW (GAAG), WYYW 
(GTTG), while also containing exactly six G's. Also required for a 24mer to 
be included was that there be at most six bases between every neighboring 
pair of G's. Another way of putting this is that there are at most six non- 
G's between any two G's. Also, each G nearest the 5'-end of its 
oligonucleotide (the left-hand side as written in Table I) was required to 
occupy one of the first to seventh positions (counting the 5'-terminal 
position as the first position). A set of candidate sequences 
was obtained by eliminating any new sequence that was found to have a 
maximum simple homology of 16/24 or more with any of the previous set of 
173 oligonucleotides (Table I, SEQ ID NOs: 1 to 173). As above, an 
arbitrary 174th sequence was chosen and candidate sequences eliminated by 
comparison therewith. In this case the permitted maximum degree of simple 
homology was 16/24. A second sequence was also eliminated if there were 
ten consecutive matches between the two (i.e., it was notionally possible 
to generate a phantom sequence containing a sequence of 10 bases that is 
identical to a sequence in each of the sequences being compared). A 
second sequence was also eliminated if it was possible to generate a 
phantom sequence 20 bases in length or greater. 
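
The composition rules stated above for the additional 24mers can be expressed as a simple filter. The sketch below is illustrative and makes one assumption not spelled out in the text: the two-G 4mers are looked for at block positions (bases 1-4, 5-8, and so on), since the sequences are assembled from 4mer blocks. The function name `passes_rules` and the test sequence are hypothetical, and the homology and phantom-sequence screens are not included.

```python
TWO_G_BLOCKS = {"GATG", "GTAG", "GAAG", "GTTG"}


def passes_rules(seq):
    """Check the stated composition rules for a candidate 24mer."""
    if len(seq) != 24:
        return False
    # Excluded sequence patterns.
    if any(pattern in seq for pattern in ("GG", "AAAA", "TTTT")):
        return False
    # At least one block must be a 4mer containing two G's (assumed to
    # be read at block boundaries).
    blocks = [seq[i:i + 4] for i in range(0, 24, 4)]
    if not any(block in TWO_G_BLOCKS for block in blocks):
        return False
    # Exactly six G's in total.
    g_positions = [i for i, base in enumerate(seq) if base == "G"]
    if len(g_positions) != 6:
        return False
    # The G nearest the 5'-end must lie within the first seven positions
    # (index 6 is the seventh position when counting from 1).
    if g_positions[0] > 6:
        return False
    # At most six non-G bases between neighboring G's.
    return all(b - a - 1 <= 6 for a, b in zip(g_positions, g_positions[1:]))


# Hypothetical sequence built from the blocks GATG ATTA TGAT TTAG ATGT GTAA.
print(passes_rules("GATGATTATGATTTAGATGTGTAA"))
```

Sequences passing this filter would then go on to the homology comparisons against SEQ ID NOs: 1 to 173.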

A property of the polynucleotide sequences shown in Table I is that the 
maximum block homology between any two sequences is never greater than 66 2/3 
percent. This is because the computer algorithm by which the sequences were 
initially generated was designed to prevent such an occurrence. It is within 
the capability of a person skilled in the art, given the family of sequences of 
Table I, to modify the sequences, or add other sequences while largely retaining 
the property of minimal cross hybridization which the polynucleotides of Table I 
have been demonstrated to have. 

There are 210 polynucleotide sequences given in Table I. Since all 210 of 
this family of polynucleotides can work with each other as a minimally cross- 
hybridizing set, then any plurality of polynucleotides that is a subset of the 
210 can also act as a minimally cross-hybridizing set of polynucleotides. An 
application in which, for example, 30 molecules are to be sorted using a family 
of polynucleotide tags and tag complements could thus use any group of 30 
sequences shown in Table I. This is not to say that some subsets may not be 
found in a practical sense to be more preferred than others. For example, it may 
be found that a particular subset is more tolerant of a wider variety of 
conditions under which hybridization is conducted before the degree of 
cross-hybridization becomes unacceptable. 

It may be desirable to use polynucleotides that are shorter in length than 
the 24 bases of those in Table I. A family of subsequences (i.e., subframes of 
the sequences illustrated) based on those contained in Table I having as few as 
10 bases per sequence could be chosen, so long as the subsequences are chosen to 
retain homological properties between any two of the sequences of the family 
important to their non cross-hybridization. 

The selection of sequences using this approach would be amenable to a 
computerized process. Thus for example, a string of 10 contiguous bases of the 
first 24mer of Table I could be selected: GATTTGTATT GATTGAGATTAAAG (here 
the first ten bases, GATTTGTATT). 

A string of contiguous bases from the second 24mer could then be selected 
and compared for maximum homology against the first chosen sequence: 
TGATTGTAGTATGT ATTGATAAAG 

Systematic pairwise comparison could then be carried out to determine if 
the maximum homology requirement of 66 2/3 percent is violated: 

Alignment                   Matches

         GATTTGTATT         1
ATTGATAAAG

        GATTTGTATT          0
ATTGATAAAG

       GATTTGTATT           1
ATTGATAAAG

      GATTTGTATT            1
ATTGATAAAG

     GATTTGTATT             1
ATTGATAAAG

    GATTTGTATT              1
ATTGATAAAG

   GATTTGTATT               3
ATTGATAAAG

  GATTTGTATT                1
ATTGATAAAG

 GATTTGTATT                 2
ATTGATAAAG

GATTTGTATT                  2
ATTGATAAAG

GATTTGTATT                  5 (*)
 ATTGATAAAG

GATTTGTATT                  3
  ATTGATAAAG

GATTTGTATT                  3
   ATTGATAAAG

GATTTGTATT                  2
    ATTGATAAAG

GATTTGTATT                  1
     ATTGATAAAG

GATTTGTATT                  1
      ATTGATAAAG

GATTTGTATT                  3
       ATTGATAAAG

GATTTGTATT                  1
        ATTGATAAAG

GATTTGTATT                  0
         ATTGATAAAG

As can be seen, the maximum homology between the two selected
subsequences is 50 percent (5 matches out of the total length of 10), and so
these two sequences are compatible with each other.
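
The systematic pairwise comparison tabulated above can be sketched in a few
lines of Python. This is only an illustrative sketch: the function names are
hypothetical, and expressing homology as a percentage of the shorter sequence
length is an assumption made here to mirror the "5 matches out of 10" figure.

```python
def ungapped_match_counts(s, t):
    """Count matching positions for every ungapped overlap of s and t.

    Offsets run from the last base of t lying over the first base of s
    to the first base of t lying over the last base of s.
    """
    counts = []
    for offset in range(-(len(t) - 1), len(s)):
        matches = 0
        for j, base in enumerate(t):
            i = j + offset  # position in s that t[j] lies over
            if 0 <= i < len(s) and s[i] == base:
                matches += 1
        counts.append(matches)
    return counts


def max_homology_percent(s, t):
    """Best ungapped match count as a percentage of the shorter length."""
    return 100.0 * max(ungapped_match_counts(s, t)) / min(len(s), len(t))


first = "GATTTGTATT"   # 10mer taken from the first 24mer of Table I
second = "ATTGATAAAG"  # 10mer taken from the second 24mer of Table I
print(ungapped_match_counts(first, second))
print(max_homology_percent(first, second))
```

The list printed first follows the same order as the alignments tabulated
above, and the percentage is then compared with the 66 2/3 percent limit.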

A 10mer subsequence can be selected from the third 24mer sequence of
Table I, and pairwise compared to each of the first two 10mer sequences to
determine its compatibility therewith, etc., and in this way a family of 10mer
sequences developed.
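
One way such a family could be built up by computer is sketched below. It is
a hedged illustration rather than the procedure actually used: the greedy
"first acceptable window" choice and the function names are assumptions, and
only the two 24mers quoted above are used as input.

```python
def max_ungapped_matches(s, t):
    """Largest number of matching positions over all ungapped overlaps."""
    best = 0
    for offset in range(-(len(t) - 1), len(s)):
        matches = sum(
            1
            for j, base in enumerate(t)
            if 0 <= j + offset < len(s) and s[j + offset] == base
        )
        best = max(best, matches)
    return best


def build_subsequence_family(parents, k=10):
    """Greedily pick one k-base window from each parent sequence."""
    family = []
    for parent in parents:
        for start in range(len(parent) - k + 1):
            candidate = parent[start:start + k]
            # Accept when homology with every chosen member is at most
            # 66 2/3 percent, i.e. 3 * matches <= 2 * k.
            if all(3 * max_ungapped_matches(candidate, member) <= 2 * k
                   for member in family):
                family.append(candidate)
                break
        # A parent with no acceptable window contributes nothing.
    return family


parents = [
    "GATTTGTATTGATTGAGATTAAAG",  # first 24mer of Table I
    "TGATTGTAGTATGTATTGATAAAG",  # second 24mer of Table I
]
family = build_subsequence_family(parents)
print(family)
```

The sketch takes the leftmost acceptable window of each 24mer, so it need not
reproduce the particular subsequences chosen in the worked example.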

It is within the scope of this invention to obtain families of
sequences containing 11mer, 12mer, 13mer, 14mer, 15mer, 16mer, 17mer, 18mer,
19mer, 20mer, 21mer, 22mer and 23mer sequences by analogy to that shown for
10mer sequences.

It may be desirable to have a family of sequences in which there are
sequences greater in length than the 24mer sequences shown in Table I. It is
within the capability of a person skilled in the art, given the family of
sequences shown in Table I, to obtain such a family of sequences. One
possible approach would be to insert into each sequence at one or more
locations a nucleotide, non-natural base or analogue, such that the longer
sequences should not have greater similarity to one another than any two of
the original non-cross-hybridizing sequences of Table I, and such that the
addition of extra bases to the tag sequences does not result in a major change
in the thermodynamic properties of the tag sequences of that set; for example,
the GC content must be maintained between 10% and 40% with a variance from the
average of 20%. This method of inserting bases could be used to obtain a
family of sequences up to 40 bases long.

Given a particular family of sequences that can be used as a family of
tags (or tag complements), e.g., those of Table I or Table II, or the
combined sequences of these two tables, a skilled person will readily
recognize variant families that work equally well.

Again taking the sequences of Table I for example, every T could be 
converted to an A and vice versa and no significant change in the cross- 
hybridization properties would be expected to be observed. This would also 
be true if every G were converted to a C. 
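
The reason such substitutions leave sequence identity comparisons unchanged
can be illustrated with a short sketch. It is offered only as an assumption-
laden illustration: it compares ungapped match counts (not hybridization
thermodynamics) for the two 24mers of Table I quoted earlier, and the helper
name is hypothetical.

```python
def match_profile(s, t):
    """Number of matches for every ungapped overlap of s and t."""
    profile = []
    for offset in range(-(len(t) - 1), len(s)):
        profile.append(sum(
            1
            for j, base in enumerate(t)
            if 0 <= j + offset < len(s) and s[j + offset] == base
        ))
    return profile


SWAP_A_T = str.maketrans("AT", "TA")  # every A becomes T and vice versa
G_TO_C = str.maketrans("G", "C")      # every G becomes C

first = "GATTTGTATTGATTGAGATTAAAG"   # first 24mer of Table I
second = "TGATTGTAGTATGTATTGATAAAG"  # second 24mer of Table I

original = match_profile(first, second)
swapped = match_profile(first.translate(SWAP_A_T), second.translate(SWAP_A_T))
recoded = match_profile(first.translate(G_TO_C), second.translate(G_TO_C))
print(original == swapped == recoded)
```

Because each substitution renames bases one-for-one across the whole family,
two positions match after the change exactly when they matched before it.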

Also, all of the sequences of a family could be taken to be constructed
in the 5'-3' direction, as is the convention, or all of the constructions of
sequences could be in the opposite direction (3'-5').

There are additional modifications that can be carried out. For
example, C has not been used in the family of sequences. Substitution of C
in place of one or more T's of a particular sequence would yield a sequence
that is at least as low in homology with every other sequence of the family
as the particular sequence chosen to be modified was. It is thus possible to
substitute C in place of one or more T's in any of the sequences shown in
Table I. Analogously, substituting C in place of one or more A's is
possible, or substituting C in place of one or more T's is possible.

It is preferred that the sequences of a given family are of the same, 
or roughly the same length. Preferably, all the sequences of a family of 
sequences of this invention have a length that is within five bases of the 
base-length of the average of the family. More preferably, all sequences are 
within four bases of the average base-length. Even more preferably, all or 
almost all sequences are within three bases of the average base-length of the 
family. Better still, all or almost all sequences have a length that is 
within two of the base-length of the average of the family. 

It is also possible for a person skilled in the art to derive sets of
sequences from the family of sequences that is the subject of this patent and
remove sequences that would be expected to have undesirable hybridization
properties.

Methods For Synthesis Of Oligonucleotide Families 

Preferably oligonucleotide sequences of the invention are
synthesized directly by standard phosphoramidite synthesis approaches
and the like (Caruthers et al, Methods in Enzymology; 154, 287-313:
1987; Lipshutz et al, Nature Genet.; 21, 20-24: 1999; Fodor et al,
Science; 251, 763-773: 1991). Alternative chemistries involving
non-natural bases such as peptide nucleic acids or modified nucleosides
that offer advantages in duplex stability may also be used (Hacia et
al, Nucleic Acids Res.; 27, 4034-4039: 1999; Nguyen et al, Nucleic
Acids Res.; 27, 1492-1498: 1999; Weiler et al, Nucleic Acids Res.; 25,
2792-2799: 1997). It is also possible to synthesize the oligonucleotide
sequences of this invention with alternate nucleotide backbones such as
phosphorothioate or phosphoramidate nucleotides. Methods involving
synthesis through the addition of blocks of sequence in a stepwise
manner may also be employed (Lyttle et al, Biotechniques; 19, 274-280:
1995). Synthesis may be carried out directly on the substrate to be
used as a solid phase support for the application, or the
oligonucleotide can be cleaved from the support for use in solution or
coupling to a second support.

Solid Phase Supports

There are several different solid phase supports that can be used with 
the invention. They include but are not limited to slides, plates, chips, 
membranes, beads, microparticles and the like. The solid phase supports can 
also vary in the materials that they are composed of including plastic, 
glass, silicon, nylon, polystyrene, silica gel, latex and the like. The 
surface of the support is coated with the complementary sequence of the same. 

In preferred embodiments, the family of tag complement sequences is
derivatized to allow binding to a solid support. Many methods of
derivatizing a nucleic acid for binding to a solid support are known in the
art (Hermanson G., Bioconjugate Techniques; Acad. Press: 1996). The sequence
tag may be bound to a solid support through covalent or non-covalent bonds
(Iannone et al, Cytometry; 39: 131-140, 2000; Matson et al, Anal. Biochem.;
224: 110-106, 1995; Proudnikov et al, Anal. Biochem.; 259: 34-41, 1998;
Zammatteo et al, Analytical Biochemistry; 280: 143-150, 2000). The sequence
tag can be conveniently derivatized for binding to a solid support by
incorporating modified nucleic acids in the terminal 5' or 3' locations.

A variety of moieties useful for binding to a solid support (e.g.,
biotin, antibodies, and the like), and methods for attaching them to nucleic
acids, are known in the art. For example, an amine-modified nucleic acid
base (available from, e.g., Glen Research) may be attached to a solid support
(for example, Covalink-NH, a polystyrene surface grafted with secondary amino
groups, available from Nunc) through a bifunctional crosslinker (e.g.,
bis(sulfosuccinimidyl) suberate, available from Pierce). Additional spacing
moieties can be added to reduce steric hindrance between the capture moiety
and the surface of the solid support.

Attaching Tags to Analytes for Sorting 

A family of oligonucleotide tag sequences can be conjugated to a
population of analytes, most preferably polynucleotide sequences, in several
different ways including but not limited to direct chemical synthesis,
chemical coupling, ligation, amplification, and the like. Sequence tags that
have been synthesized with primer sequences can be used for enzymatic
extension of the primer on the target, for example, in PCR amplification.

Detection of Single Nucleotide Polymorphisms Using Primer Extension 

There are a number of areas of genetic analysis where families of
non-cross-hybridizing sequences can be applied, including disease diagnosis,
single nucleotide polymorphism analysis, genotyping, expression analysis and
the like. One such approach for genetic analysis, referred to as the primer
extension method (also known as Genetic Bit Analysis (Nikiforov et al,
Nucleic Acids Res.; 22, 4167-4175: 1994; Head et al, Nucleic Acids Res.; 25,
5065-5071: 1997)), is an extremely accurate method for identification of the
nucleotide located at a specific polymorphic site within genomic DNA. In
standard primer extension reactions, a portion of genomic DNA containing a
defined polymorphic site is amplified by PCR using primers that flank the
polymorphic site. In order to identify which nucleotide is present at the
polymorphic site, a third primer is synthesized such that the polymorphic
position is located immediately 3' to the primer. A primer extension
reaction is set up containing the amplified DNA, the primer for extension, up
to 4 dideoxynucleoside triphosphates, each labelled with a different
fluorescent dye, and a DNA polymerase such as the Klenow fragment of DNA
Polymerase I. The use of dideoxynucleotides ensures that a single base is
added to the 3' end of the primer, a site corresponding to the polymorphic
site. In this way the identity of the nucleotide present at a specific
polymorphic site can be determined by the identity of the fluorescent dye-
labelled nucleotide that is incorporated in each reaction. One major
drawback to this approach is its low throughput. Each primer extension
reaction is carried out independently in a separate tube.

Universal sequences can be used to enhance the throughput of the primer
extension assay as follows. A region of genomic DNA containing multiple
polymorphic sites is amplified by PCR. Alternately, several genomic regions 
containing one or more polymorphic sites each are amplified together in a 
multiplexed PCR reaction. The primer extension reaction is carried out as 
described above except that the primers used are chimeric, each containing a 
unique universal tag at the 5' end and the sequence for extension at the 3' 
end. In this way, each gene-specific sequence would be associated with a 
specific universal sequence. The chimeric primers would be hybridized to the 
amplified DNA and primer extension carried out as described above. This 
would result in a mixed pool of extended primers, each with a specific 
fluorescent dye characteristic of the incorporated nucleotide. Following the 
primer extension reaction, the mixed extension reactions are hybridized to an 
array containing probes that are reverse complements of the universal 
sequences on the primers. This would segregate the products of a number of 
primer extension reactions into discrete spots. The fluorescent dye present 
at each spot would then identify the nucleotide incorporated at each specific
location. 
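
The bookkeeping behind this sorting step can be sketched as follows. The
sketch idealizes hybridization as exact tag/probe complementarity, and the
locus names, extension sequences and function names are hypothetical, chosen
only for illustration; the two tags are the 24mers of Table I quoted earlier.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def make_chimeric_primer(tag, extension_sequence):
    """Universal tag at the 5' end, locus-specific sequence at the 3' end."""
    return tag + extension_sequence


def sort_by_tag(array_probes, extended_primers, tag_length=24):
    """Map each array spot to the dye-labelled base(s) captured there.

    array_probes: {spot name: probe}, each probe being the reverse
    complement of one tag. extended_primers: (chimeric primer, added base).
    """
    spot_for_tag = {
        reverse_complement(probe): spot for spot, probe in array_probes.items()
    }
    calls = {spot: [] for spot in array_probes}
    for primer, base in extended_primers:
        calls[spot_for_tag[primer[:tag_length]]].append(base)
    return calls


tags = {
    "locus_A": "GATTTGTATTGATTGAGATTAAAG",  # first 24mer of Table I
    "locus_B": "TGATTGTAGTATGTATTGATAAAG",  # second 24mer of Table I
}
# Hypothetical locus-specific extension sequences, for illustration only.
extension = {
    "locus_A": "ACGTTGCAAGCTTGACCATG",
    "locus_B": "TTGACCGTAGGCTAACGTCA",
}

probes = {locus: reverse_complement(tag) for locus, tag in tags.items()}
primers = {
    locus: make_chimeric_primer(tags[locus], extension[locus]) for locus in tags
}
# A mixed pool of extended primers, each paired with its incorporated base.
pool = [(primers["locus_B"], "T"), (primers["locus_A"], "G"),
        (primers["locus_A"], "A")]
print(sort_by_tag(probes, pool))
```

Two different bases at one spot, as for the hypothetical locus_A here, would
correspond to a heterozygous sample.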

Kits Using Families Of Tag Sequences 

The families of non-cross-hybridizing sequences may be provided in kits
for use in, for example, genetic analysis. Such kits include at least one set
of non-cross-hybridizing sequences in solution or on a solid support.
Preferably the sequences are attached to microparticles and are provided with 
buffers and reagents that are appropriate for the application. Reagents may 
include enzymes, nucleotides, fluorescent labels and the like that would be 
required for specific applications. Instructions for correct use of the kit 
for a given application will be provided. 

EXAMPLES 
EXAMPLE 1 

Demonstrate Non Cross Talk Behavior 

One hundred oligonucleotide probes corresponding to a family of non-
cross talking oligonucleotides from Table I were synthesized by Integrated
DNA Technologies (IDT, Coralville IA). These oligonucleotides incorporated a
C6 aminolink group coupled to the 5' end of the oligo through a C18 ethylene
glycol spacer. These probes were used to prepare microarrays as follows. The
probes were resuspended at a concentration of 50 µM in 150 mM NaPO4, pH 8.5.
The probes were spotted onto the surface of a SuperAldehyde slide (Telechem
Int., Sunnyvale CA) using an SDDC-II microarray spotter (ESI, Toronto Ont).
The spots formed were approximately 120 µm in diameter with 200 µm centre-to-
centre spacing. Each probe was spotted 8 times on each microarray. Following
spotting, the arrays were processed essentially as described by the slide
manufacturer. Briefly, the arrays were treated with 67 mM sodium borohydride
in PBS/EtOH (3:1) for 5 minutes then washed with 4 changes of 0.1% SDS. The
arrays were not boiled.

One hundred labelled oligonucleotide targets were also synthesized by
IDT. The sequence of these targets corresponded to the reverse complement
of the 100 probe sequences. The targets were labelled at the 5' end with
Cy3.

Each Cy3-labeled target oligonucleotide was hybridized separately to
two microarrays, each of which contained all 100 oligonucleotide probes.
Hybridizations were carried out at 42°C for 2 hours in a 40 µl reaction and
contained 40 nM of the labelled target suspended in 10 mM Tris-HCl, pH 8.3, 50
mM KCl, 0.1% Tween-20. These are low stringency hybridization conditions
designed to provide a vigorous test of the performance of the family of non-
cross hybridizing sequences. Hybridizations were carried out by depositing
the hybridization solution on a clean cover slip then carefully positioning
the microarray slide over the cover slip in order to avoid bubbles. The
slide was then inverted and transferred to a humid chamber for incubation.
Following hybridization, the cover slip was removed and the microarray was
washed in hybridization buffer for 15 minutes at room temperature. The slide
was then dried by brief centrifugation.

Hybridized microarrays were scanned using a ScanArray Lite (GSI-
Lumonics, Billerica MA). The laser power and photomultiplier tube voltage
used for scanning each hybridized microarray were optimized in order to 
maximize the signal intensity from the spots representing the perfect match. 

The results of a sample hybridization are shown in Figure 1. Figure 1a
shows the hybridization pattern seen when a microarray containing all 100
probes was hybridized with the target complementary to probe 181234. The 4
sets of paired spots correspond to the probe complementary to the target.
Figure 1b shows the pattern seen when a similar array was hybridized with a
mix of all 100 targets.

EXAMPLE 2 

Tag sequences used in sorting polynucleotides 

The family of non-cross-hybridizing sequence tags or a subset thereof
can be attached to oligonucleotide probe sequences during synthesis and used
to generate amplified probe sequences. In order to test the feasibility of
PCR amplification with non-cross-hybridizing sequence tags and subsequently
addressing each respective sequence to its appropriate location on two-
dimensional or bead arrays, the following experiment was devised. A 24mer
tag sequence was connected in a 5'-3' specific manner to a p53 exon specific
sequence (20mer reverse primer). The connecting p53 sequence represented the
inverse complement of the nucleotide gene sequence. To facilitate the
subsequent generation of single stranded DNA post-amplification, the tag-
Reverse primer was synthesized with a phosphate modification (PO4) on the 5'-
end. A second PCR primer was also generated for each desired exon, which
represented the Forward (5'-3') amplification primer. In this instance the
Forward primer was labeled with a 5'-biotin modification to allow detection
with Cy3-avidin or equivalent.

A practical example of the aforementioned description is as
follows: For exon 1 of the human p53 tumor suppressor gene sequence
the following tag-Reverse primer was generated:

                                222087            222063
5'-PO4-GATTGTAAGATTTGATAAAGTGTA-TCCAGGGAAGCGTGTCACCGTCGT-3'
       Tag Sequence #3          Exon 1 Reverse

The numbering above the Exon-1 reverse primer represents the genomic
nucleotide positions of the indicated bases.

The corresponding Exon-1 Forward primer sequence is as follows: 

          221873            221896
5'-Biotin-TCATGGCGACTGTCCAGCTTTGTG-3'

In combination these primers will amplify a product of 214 bp plus a
24 bp tag extension yielding a total size of 238 bp.

Once amplified, the PCR product was purified using a QIAquick PCR
purification kit and the resulting DNA was quantified. To generate
single stranded DNA, the DNA was subjected to λ-exonuclease digestion
thereby resulting in the exposure of a single stranded sequence (anti-
tag) complementary to the tag-sequence covalently attached to the solid
phase array. The resulting product was heated to 95°C for 5 minutes and
then directly applied to the array at a concentration of 10-50 nM.
Following hybridization and concurrent sorting, the tag-Exon 1
sequences were visualized using Cy3-streptavidin. In addition to
direct visualization of the biotinylated product, the product itself
can now act as a substrate for further analysis of the amplified
region, such as SNP detection and haplotype determination.

The present invention also includes a family of 1168 24mer 
polynucleotides that have been demonstrated to be minimally cross-hybridizing 
with each other. This family of polynucleotides is thus useful as a family 
of tags, and their complements as tag complements. 

In order to be considered for inclusion into the family, a sequence had
to satisfy a certain number of rules regarding its composition. For example,
repetitive regions that present potential hybridization problems such as four
or more of a similar base (e.g., AAAA or TTTT) or pairs of Gs were forbidden.
Another rule is that each sequence contains exactly six Gs and no Cs, in
order to have sequences that are more or less isothermal. Also required for
a 24mer to be included is that there must be at most six bases between every
neighboring pair of Gs. Another way of putting this is that there are at
most six non-Gs between any two consecutive Gs. Also, each G nearest the
5'-end (resp. 3'-end) of its oligonucleotide (the left-hand (resp. right-
hand) side as written in Table II) was required to occupy one of the first to
seventh positions (counting the 5'-terminal (resp. 3'-terminal) position as
the first position.)
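
The composition rules just listed lend themselves to a simple filter. The
following is a minimal sketch, assuming the rules are applied exactly as
worded above; the function name is hypothetical, and the two test inputs are
the 24mers of Table I quoted earlier, used here only because they are
available in the text.

```python
import re


def passes_composition_rules(seq):
    """Check a candidate 24mer against the composition rules described above."""
    if len(seq) != 24 or set(seq) - set("ATG"):
        return False  # wrong length, or contains C (or another character)
    if seq.count("G") != 6:
        return False  # exactly six Gs are required
    if "GG" in seq or re.search(r"(.)\1{3}", seq):
        return False  # no pairs of Gs, no run of four identical bases
    g_positions = [i for i, base in enumerate(seq) if base == "G"]
    # At most six non-Gs between any two consecutive Gs.
    if any(b - a - 1 > 6 for a, b in zip(g_positions, g_positions[1:])):
        return False
    # The outermost Gs lie within the first seven positions from each end.
    if g_positions[0] > 6 or g_positions[-1] < len(seq) - 7:
        return False
    return True


first = "GATTTGTATTGATTGAGATTAAAG"   # first 24mer of Table I
second = "TGATTGTAGTATGTATTGATAAAG"  # second 24mer of Table I
print(passes_composition_rules(first), passes_composition_rules(second))
```

A filter of this kind would typically be run on each candidate before any
pairwise similarity test, since it is much cheaper to evaluate.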

The process used to design families of sequences that do not
exhibit cross-hybridization behavior is illustrated generally in Figure
5. Depending on the application for which these families of sequences
will be used, various rules are designed. A certain number of rules can
specify constraints for sequence composition (such as the ones
described in the previous paragraph). The other rules are used to
judge whether two sequences are too similar. Based on these rules, a
computer program can derive families of sequences that exhibit minimal
or no cross-hybridization behavior. The exact method used by the
computer program is not crucial since various computer programs can
derive similar families based on these rules. Such a program is for
example described in international patent application No.
PCT/CA 01/00141 published under WO 01/59151 on August 16, 2001. Other
programs can use different methods, such as the ones summarized below.

A first method of generating a maximum number of minimally cross-
hybridizing polynucleotide sequences starts with any number of non-
cross-hybridizing sequences, for example just one sequence, and
increases the family as follows. A certain number of sequences is
generated and compared to the sequences already in the family. The
generated sequences that exhibit too much similarity with sequences
already in the family are dropped. Among the "candidate sequences"
that remain, one sequence is selected and added to the family. The
other candidate sequences are then compared to the selected sequence,
and the ones that show too much similarity are dropped. A new sequence
is selected from the remaining candidate sequences, if any, and added
to the family, and so on until there are no candidate sequences left.
At this stage, the process can be repeated (generating a certain number
of sequences and comparing them to the sequences in the family, etc.)
as often as desired. The family obtained at the end of this method
contains only minimally cross-hybridizing sequences.
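
The first method can be sketched as below. This is a hedged illustration and
not the program referred to above: candidates are drawn at random from A, T
and G, and "too much similarity" is replaced by a single stand-in rule (more
than 16 matches in some ungapped overlap); all names are hypothetical.

```python
import random


def max_ungapped_matches(s, t):
    """Largest number of matching positions over all ungapped overlaps."""
    best = 0
    for offset in range(-(len(t) - 1), len(s)):
        matches = sum(
            1
            for j, base in enumerate(t)
            if 0 <= j + offset < len(s) and s[j + offset] == base
        )
        best = max(best, matches)
    return best


def too_similar(s, t, max_matches=16):
    """Stand-in similarity rule; the text applies several measures."""
    return max_ungapped_matches(s, t) > max_matches


def random_tag(rng, length=24, alphabet="ATG"):
    return "".join(rng.choice(alphabet) for _ in range(length))


def grow_family(family, rng, rounds=2, batch_size=20):
    family = list(family)
    for _ in range(rounds):
        # Generate candidates and drop those too similar to the family.
        candidates = [random_tag(rng) for _ in range(batch_size)]
        candidates = [c for c in candidates
                      if not any(too_similar(c, member) for member in family)]
        # Move candidates into the family one at a time, pruning the rest.
        while candidates:
            chosen = candidates.pop(0)
            family.append(chosen)
            candidates = [c for c in candidates if not too_similar(c, chosen)]
    return family


seed = ["GATTTGTATTGATTGAGATTAAAG"]  # first 24mer of Table I
family = grow_family(seed, random.Random(0))
print(len(family))
```

Every sequence is checked against the whole family before it is admitted, so
the pairwise rule holds for the family at every stage of the growth.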

A second method of generating a maximum number of minimally
cross-hybridizing polynucleotide sequences starts with a fixed-size
family of polynucleotide sequences. The sequences of this family can be
generated randomly or designed by some other method. Many sequences in
this family may not be compatible with each other, because they show
too much similarity and are not minimally cross-hybridizing. Therefore,
some sequences need to be replaced by new ones, with less similarity.
One way to achieve this consists of repeatedly replacing a sequence of
the family by the best (that is, lowest similarity) sequence among a
certain number of (for example, randomly generated) sequences that are
not part of the family. This process can be repeated until the family
of sequences shows minimal similarity, hence minimal cross-hybridizing,
or until a set number of replacements has occurred. If, at the end of
the process, some sequences do not obey the similarity rules that have
been set, they can be taken out of the family, thus providing a
somewhat smaller family that only contains minimally cross-hybridizing
sequences. Some additional rules can be added to this method in order
to make it more efficient, such as rules to determine which sequence
will be replaced.
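
A sketch of this replacement-based method follows. It rests on several
assumptions that are not taken from the text: similarity is measured only by
ungapped match counts with a limit of 16, the sequence replaced at each step
is the one with the worst similarity to the rest, and all names are
hypothetical.

```python
import random


def max_ungapped_matches(s, t):
    """Largest number of matching positions over all ungapped overlaps."""
    best = 0
    for offset in range(-(len(t) - 1), len(s)):
        matches = sum(
            1
            for j, base in enumerate(t)
            if 0 <= j + offset < len(s) and s[j + offset] == base
        )
        best = max(best, matches)
    return best


def random_tag(rng, length=24, alphabet="ATG"):
    return "".join(rng.choice(alphabet) for _ in range(length))


def worst_similarity(seq, others):
    """Highest match count between seq and any sequence in others."""
    return max((max_ungapped_matches(seq, o) for o in others), default=0)


def refine_family(family, rng, max_matches=16, max_replacements=50,
                  pool_size=10):
    family = list(family)
    if len(family) < 2:
        return family
    for _ in range(max_replacements):
        scores = [worst_similarity(seq, family[:i] + family[i + 1:])
                  for i, seq in enumerate(family)]
        worst = max(range(len(family)), key=scores.__getitem__)
        if scores[worst] <= max_matches:
            break  # every pair already obeys the rule
        others = family[:worst] + family[worst + 1:]
        pool = [random_tag(rng) for _ in range(pool_size)]
        best = min(pool, key=lambda c: worst_similarity(c, others))
        if worst_similarity(best, others) < scores[worst]:
            family[worst] = best  # replace by the lowest-similarity candidate
    # Take out any sequences that still break the rule.
    kept = []
    for seq in family:
        if worst_similarity(seq, kept) <= max_matches:
            kept.append(seq)
    return kept


rng = random.Random(1)
# Three identical copies guarantee that the starting family has conflicts.
start = ["GATTTGTATTGATTGAGATTAAAG"] * 3 + [random_tag(rng) for _ in range(5)]
family = refine_family(start, rng)
print(len(start), len(family))
```

The final pruning pass is what guarantees the pairwise rule; the replacement
loop only tries to keep the family as large as possible before pruning.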

Such methods have been used to obtain the 1168 non-cross- 
hybridizing tags of Table II that are also the subject of this patent 
application. 

One embodiment of the invention is a composition comprising molecules
for use as tags or tag complements wherein each molecule comprises an
oligonucleotide selected from a set of oligonucleotides based on the group of
sequences set out in Table IIA, wherein each of the numeric identifiers 1 to
3 (see the Table) is a nucleotide base selected to be different from the
others of 1 to 3. According to this embodiment, several different families
of specific sets of oligonucleotide sequences are described, depending upon
the assignment of bases made to the numeric identifiers 1 to 3.

The sequences contained in Table II have a mathematical relationship to 
each other, described as follows. 

Let S and T be two DNA sequences of lengths s and t respectively. While
the term "alignment" of nucleotide sequences is widely used in the field of
biotechnology, in the context of this invention the term has a specific
meaning illustrated here. An alignment of S and T is a 2xp matrix A (with
p ≥ s and p ≥ t) such that the first (or second) row of A contains the
characters of S (or T respectively) in order, interspersed with p-s (or p-t
respectively) spaces. It is assumed that no column of the alignment matrix
contains two spaces, i.e., that any alignment in which a column contains two
spaces is ignored and not considered here. The columns containing the same base
in both rows are called matches, while the columns containing different bases
are called mismatches. Each column of an alignment containing a space in its
first row is called an insertion and each column containing a space in its
second row is called a deletion, while a column of the alignment containing a
space in either row is called an indel. Insertions and deletions within a
sequence are represented by the character "-". A gap is a continuous sequence
of spaces in one of the rows (that is neither immediately preceded nor
immediately followed by another space in the same row), and the length of a gap
is the number of spaces in that gap. An internal gap is one in which its first
space is preceded by a base and its last space is followed by a base, and an
internal indel is an indel belonging to an internal gap. Finally, a block is a
continuous sequence of matches (that is neither immediately preceded nor
immediately followed by another match), and the length of a block is the number
of matches in that block. In order to illustrate these definitions, two
sequences S = TGATCGTAGCTACGCCGCG (of length s = 19; SEQ ID NO:1169) and T =
CGTACGATTGCAACGT (of length t = 16; SEQ ID NO:1170) are considered. Exemplary
alignment R1 of S and T (with p = 23) is:

Column:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
S:       -  -  -  -  T  G  A  T  C  G  T  A  G  C  T  A  C  G  C  C  G  C  G
T:       C  G  T  A  C  G  A  T  -  -  T  -  G  C  A  A  C  G  T  -  -  -  -

Columns 1 to 4, 9, 10, 12 and 20 to 23 are indels, columns 6, 7, 8, 11,
13, 14, 16, 17 and 18 are matches, and columns 5, 15 and 19 are mismatches.
Columns 9 and 10 form a gap of length 2, while columns 16 to 18 form a block of
length 3. Columns 9, 10 and 12 are internal indels.

A score is assigned to the alignment A of two sequences by assigning
weights to each of matches, mismatches and gaps as follows:

• the reward for a match m,

• the penalty for a mismatch mm,

• the penalty for opening a gap og,

• the penalty for extending a gap eg.

Once these values are set, a score is assigned to each column of the
alignment according to the following rules:

1. assign 0 to each column preceding the first match and to each 
column following the last match. 

2. for each of the remaining columns, assign m if it is a match, -mm 
if it is a mismatch, -og-eg if it is the first indel of a gap, -eg 
if it is an indel but not the first indel of a gap. 

The score of the alignment A is the sum of the scores of its columns. An
alignment is said to be of maximum score if no other alignment of the same
two sequences has a higher score (with the same values of m, mm, og and eg).
A person knowledgeable in the field will recognize this method of scoring an
alignment as scoring a local (as opposed to global) alignment with affine gap
penalties (that is, gap penalties that can distinguish between the first
indel of a gap and the other indels). It will be appreciated that the total
number of indels that open a gap is the same as the total number of gaps and
that an internal indel is not one of those assigned a 0 in rule (1) above.
It will also be noted that the foregoing rule (1) assigns a 0 for non-internal
mismatches. An internal mismatch is a mismatch that is preceded and followed
(not necessarily immediately) by a match.

As an illustration, if the values of m, mm, og and eg are set to 3, 1,
2 and 1 respectively, alignment R1 has a score of 19, determined as shown
below:

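Scoring of Alignment R1 (spaces are shown as -)

S:      -  -  -  -  T  G  A  T  C  G  T  A  G  C  T  A  C  G  C  C  G  C  G
T:      C  G  T  A  C  G  A  T  -  -  T  -  G  C  A  A  C  G  T  -  -  -  -
Score:  0  0  0  0  0  3  3  3 -3 -1  3 -3  3  3 -1  3  3  3  0  0  0  0  0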

Note that for two given sequences S and T, there are numerous alignments. 
There are often several alignments of maximum score. 
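
The column-scoring rules (1) and (2) can be checked with a short sketch. It
is illustrative only: the function name is hypothetical, spaces are written
as "-", and the two rows of alignment R1 are reconstructed from the column-by-
column description given in the text.

```python
def score_alignment(row_s, row_t, m, mm, og, eg, space="-"):
    """Score an alignment given as two equal-length rows."""
    if len(row_s) != len(row_t):
        raise ValueError("both rows of an alignment must have p columns")
    columns = list(zip(row_s, row_t))
    match_columns = [k for k, (a, b) in enumerate(columns) if a == b != space]
    if not match_columns:
        return 0  # rule 1 assigns 0 to every column when there is no match
    score = 0
    # Rule 1: columns before the first match and after the last score 0,
    # so only the columns in between need to be visited.
    for k in range(match_columns[0], match_columns[-1] + 1):
        a, b = columns[k]
        if space in (a, b):
            row = 0 if a == space else 1  # row that holds the space
            extends_gap = columns[k - 1][row] == space
            score -= eg if extends_gap else og + eg
        elif a == b:
            score += m
        else:
            score -= mm
    return score


row_s = "----TGATCGTAGCTACGCCGCG"  # S = TGATCGTAGCTACGCCGCG with 4 spaces
row_t = "CGTACGAT--T-GCAACGT----"  # T = CGTACGATTGCAACGT with 7 spaces
print(score_alignment(row_s, row_t, m=3, mm=1, og=2, eg=1))
```

With weights {3, 1, 2, 1} this reproduces the score of 19 given above for
alignment R1.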

Based on these alignments, five sequence similarity measures are
defined as follows. For two sequences S and T, and weights {m, mm, og, eg}:

• M1 is the maximum number of matches over all alignments free of
internal indels;

• M2 is the maximum length of a block over all alignments;

• M3 is the maximum number of matches over all alignments of maximum
score;

• M4 is the maximum sum of the lengths of the longest two blocks over all
alignments of maximum score;

• M5 is the maximum sum of the lengths of all the blocks of length at
least 3, over all alignments of maximum score.

Notice that, by definition, the following inequalities between these
similarity measures are obtained: M4 ≤ M3 and M5 ≤ M3. Also, in order to
determine M2 it is sufficient to determine the maximum length of a block over
all alignments free of internal indels. For two given sequences, the values
of M3 to M5 can vary depending on the values of the weights {m, mm, og, eg},
but not M1 and M2.

For weights {3, 1, 2, 1}, the illustrated alignment is not a maximum-
score alignment of the two example sequences. But for weights {6, 6, 0, 6}
it is; hence this alignment shows that for these two example sequences, and
weights {6, 6, 0, 6}, M2 ≥ 3, M3 ≥ 9, M4 ≥ 6 and M5 ≥ 6. In order to
determine the exact values of M1 to M5, all the necessary alignments need to
be considered. M1 and M2 can be found by looking at the s+t-1 alignments
free of internal indels, where s and t are the lengths of the two sequences
considered. Mathematical tools known as dynamic programming can be
implemented on a computer and used to determine M3 to M5 in a very quick way.
Using a computer program to do these calculations, it was determined that:

• with the weights {6, 6, 0, 6}, M1 = 8, M2 = 4, M3 = 10, M4 = 6 and M5 =

• with the weights {3, 1, 2, 1}, M1 = 8, M2 = 4, M3 = 10, M4 = 6 and M5 =
4.
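
The values of M1 and M2 quoted above can be reproduced by scanning the s+t-1
alignments free of internal indels, as the text describes. The sketch below
is a minimal illustration with a hypothetical function name; it does not
attempt M3 to M5, which require dynamic programming over maximum-score
alignments.

```python
def m1_m2(s, t):
    """Return (M1, M2) by scanning every ungapped overlap of s and t."""
    best_matches = 0  # M1: most matches in any single overlap
    best_block = 0    # M2: longest run of consecutive matches
    for offset in range(-(len(t) - 1), len(s)):
        matches = 0
        run = 0
        for j, base in enumerate(t):
            i = j + offset  # position in s that t[j] lies over
            if 0 <= i < len(s) and s[i] == base:
                matches += 1
                run += 1
                best_block = max(best_block, run)
            else:
                run = 0
        best_matches = max(best_matches, matches)
    return best_matches, best_block


S = "TGATCGTAGCTACGCCGCG"  # SEQ ID NO:1169
T = "CGTACGATTGCAACGT"     # SEQ ID NO:1170
print(m1_m2(S, T))
```

The number of offsets visited is len(S) + len(T) - 1, that is 34 for the two
example sequences.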

According to the preferred embodiment of this invention, two sequences S and
T each of length 24 are too similar if at least one of the following happens:

• M1 > 16 or

• M2 > 13 or

• M3 > 20 or

• M4 > 16 or

• M5 > 19

when using either weights {6, 6, 0, 6}, or {6, 6, 5, 1}, or {6, 2, 5, 1}, or
{6, 6, 6, 0}. In other words, the five similarity measures between S and T
are determined for each of the above four sets of weights, and checked
against these thresholds (for a total of 20 tests).

The above thresholds of 16, 13, 20, 16 and 19, and the above sets of
weights, were used to obtain the sequences listed in Table I. Additional
sequences can thus be added to those of Table I as long as the above alignment
rules are obeyed for all sequences.

It is also possible to alter thresholds M1, M2, etc., while remaining
within the scope of this invention. It is thus possible to substitute or add
sequences to those of Table II, or more generally to those of Table IIA, to
obtain other sets of sequences that would also exhibit reasonably low cross-
hybridization. More specifically, a set of 24mer sequences in which there are
no two sequences that are too similar, where too similar is defined as:

• M1 > 19 or

• M2 > 17 or

• M3 > 21 or

• M4 > 18 or

• M5 > 20

when using either weights {6,6,0,6}, or {6,6,5,1}, or {6,2,5,1}, or {6,6,6,0},
would also exhibit low cross-hybridization. Reducing any of the threshold values
provides sets of sequences with even lower cross-hybridization. Alternatively,
'too similar' can also be defined as:

• M1 > 19 or

• M2 > 17 or

• M3 > 21 or

• M4 > 18 or

• M5 > 20

when using weights {3,1,2,1}. Alternatively, other combinations of
weights will lead to sets of sequences with low cross-hybridization.

Notice that using weights {6,6,0,6} is equivalent to using weights {1,1,0,1},
or weights {2,2,0,2}, ... (that is, for any two sequences, the values of M1
to M5 are exactly the same whether weights {6,6,0,6} or {1,1,0,1} or {2,2,0,2}
or any other multiple of {1,1,0,1} is used).

When dealing with sequences of length other than 24, or sequences of
various lengths, the definition of similarity can be adjusted. Such adjustments
are obvious to the persons skilled in the art. For example, when comparing a
sequence of length L1 with a sequence of length L2 (with L1<L2), they can be
considered as too similar when

M1 > 19/24 x L1
M2 > 17/24 x L1
M3 > 21/24 x L1
M4 > 18/24 x L1
M5 > 20/24 x L1

when using either weights {6, 6, 0, 6}, or {6, 6, 5, 1}, or {6, 2, 5, 1}, or
{6, 6, 6, 0}.
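
The length adjustment above amounts to scaling each 24mer threshold by L1/24. The following is a minimal sketch of that arithmetic only; the names `scaled_thresholds` and `too_similar` are illustrative assumptions, and the similarity measures M1 to M5 are taken as already computed according to the definitions given earlier in this specification.

```python
from fractions import Fraction

# 24mer thresholds for M1..M5 as quoted in the text above.
BASE_THRESHOLDS = (19, 17, 21, 18, 20)
BASE_LENGTH = 24


def scaled_thresholds(l1: int) -> tuple:
    """Return the M1..M5 cut-offs for a comparison whose shorter length is l1."""
    if l1 <= 0:
        raise ValueError("sequence length must be positive")
    # Exact fractions avoid rounding the cut-off up or down.
    return tuple(Fraction(t * l1, BASE_LENGTH) for t in BASE_THRESHOLDS)


def too_similar(measures, l1: int) -> bool:
    """True if any of the supplied M1..M5 values exceeds its scaled cut-off."""
    return any(m > t for m, t in zip(measures, scaled_thresholds(l1)))
```

For L1 = 24 the function returns the unscaled thresholds, so the 24mer rule is recovered as a special case.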

Polynucleotide sequences can be composed of a subset of natural
bases, most preferably A, T and G. Sequences that are deficient in one
base possess useful characteristics, for example, in reducing potential
secondary structure formation or reduced potential for cross-hybridization
with nucleic acids in nature. Also, it is preferable to have tag
sequences that behave isothermally. This can be achieved, for example, by
maintaining a constant base composition for all sequences, such as six Gs
and eighteen As or Ts for each sequence. Additional sets of sequences can
be designed by extrapolating on the original family of non-cross-hybridizing
sequences by simple methods known to those skilled in the art.
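
The constant base composition mentioned above (six Gs and eighteen As or Ts, with no C) can be checked mechanically. The sketch below is illustrative only; the function names are assumptions and the default counts simply restate the example composition given in the text.

```python
from collections import Counter


def base_composition(seq: str) -> Counter:
    """Count each base in a tag sequence (case-insensitive)."""
    return Counter(seq.upper())


def has_target_composition(seq: str, n_g: int = 6, n_at: int = 18) -> bool:
    """Check for exactly n_g Gs, n_at As or Ts, and no other base."""
    counts = base_composition(seq)
    return (
        counts["G"] == n_g
        and counts["A"] + counts["T"] == n_at
        and sum(counts.values()) == n_g + n_at
    )
```

Applied to the 24mers quoted in this specification as SEQ ID NO:1 and SEQ ID NO:2 of Table II, the check passes for both.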

In order to validate the sequence set, a subset of sequences from the
family of 1168 sequence tags was selected and characterized, in terms of the
ability of these sequences to form specific duplex structures with their
complementary sequences, and the potential for cross-hybridization within the
sequence set. See Example 4, below. The subset of 100 sequences was randomly
selected, and analyzed using the Luminex 100 LabMAP™ platform. The 100 sequences
were chemically immobilized onto the set of 100 different Luminex microsphere
populations, such that each specific sequence was coupled to one spectrally
distinct microsphere population. The pool of 100 microsphere-immobilized probes
was then hybridized with each of the 100 corresponding complementary sequences.
Each sequence was examined individually for its specific hybridization with its
complementary sequence, as well as for its non-specific hybridization with the
other 99 sequences present in the reaction. This analysis demonstrated the
propensity of each sequence to hybridize only to its complement (perfect match),
and not to cross-hybridize appreciably with any of the other oligonucleotides
present in the hybridization reaction.

It is within the capability of a person skilled in the art, given the 
family of sequences of Table II, to modify the sequences, or add other sequences 
while largely retaining the property of minimal cross-hybridization which the 
polynucleotides of Table II have been demonstrated to have. 

There are 1168 polynucleotide sequences given in Table II. Since all 1168
of this family of polynucleotides can work with each other as a
minimally cross-hybridizing set, any plurality of polynucleotides that is a
subset of the 1168 can also act as a minimally cross-hybridizing set of
polynucleotides. An application in which, for example, 30 molecules are to be
sorted using a family of polynucleotide tags and tag complements could thus use
any group of 30 sequences shown in Table II. This is not to say that some
subsets may not be found in a practical sense to be more preferred than others.
For example, it may be found that a particular subset is more tolerant of a wider
variety of conditions under which hybridization is conducted before the degree
of cross-hybridization becomes unacceptable.

It may be desirable to use polynucleotides that are shorter in length than
the 24 bases of those in Table II. A family of subsequences (i.e., subframes of
the sequences illustrated) based on those contained in Table II having as few as
10 bases per sequence could be chosen, so long as the subsequences are chosen to
retain homological properties between any two of the sequences of the family
important to their non-cross-hybridization.

The selection of sequences using this approach would be amenable to a 
computerized process. Thus for example, a string of 10 contiguous bases of the 
first 24mer of Table II could be selected: AAATTGTGAA AGATTGTTTGTGTA (SEQ ID
NO:1).

The same string of contiguous bases from the second 24mer could then be
selected and compared for similarity against the first chosen sequence:
GTTAGAGTTA ATTGTATTTGATGA (SEQ ID NO:2 of Table II). A systematic pairwise
comparison could then be carried out to determine if the similarity requirements
are violated. If the pair of sequences does not violate any set property, a
10mer subsequence can be selected from the third 24mer sequence of Table II, and
compared to each of the first two 10mer sequences (in a pairwise fashion) to
determine its compatibility therewith, etc. In this way a family of 10mer
sequences may be developed.
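
The computerized selection described above can be sketched as a greedy loop: take the same 10-base window from each parent 24mer in turn and keep it only if it is compatible with every subsequence already accepted. In the sketch below the compatibility test is a stand-in (length of the longest shared contiguous run, with a hypothetical cut-off `max_shared`); it is not the M1 to M5 similarity test of this specification, which would be substituted in practice. All function names are illustrative assumptions.

```python
def longest_common_run(a: str, b: str) -> int:
    """Length of the longest contiguous run of bases shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for base_a in a:
        cur = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def build_subsequence_family(parents, start=0, length=10, max_shared=5):
    """Greedily collect one window per parent, rejecting windows that are
    too similar (by the stand-in measure) to any window already accepted."""
    family = []
    for parent in parents:
        candidate = parent[start:start + length]
        if len(candidate) < length:
            continue  # parent too short for the requested window
        if all(longest_common_run(candidate, kept) <= max_shared
               for kept in family):
            family.append(candidate)
    return family
```

With the two 24mers quoted above as parents, the first ten bases of each are accepted, whereas a repeated parent would be rejected on the pairwise comparison.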

It is within the scope of this invention to obtain families of sequences
containing 11mer, 12mer, 13mer, 14mer, 15mer, 16mer, 17mer, 18mer, 19mer, 20mer,
21mer, 22mer and 23mer sequences by analogy to that shown for 10mer sequences.
It may be desirable to have a family of sequences in which there are sequences
greater in length than the 24mer sequences shown in Table II. It is within the
capability of a person skilled in the art, given the family of sequences shown
in Table II, to obtain such a family of sequences. One possible approach would
be to insert into each sequence at one or more locations a nucleotide, non-natural
base or analogue such that the longer sequences should not have greater
similarity to one another than any two of the original non-cross-hybridizing
sequences of Table II, and the addition of extra bases to the tag sequences
should not result in a major change in the thermodynamic properties of the tag
sequences of that set; for example, the GC content must be maintained between
10%-40% with a variance from the average of 20%. This method of inserting bases
could be used to obtain, for example, a family of sequences up to 40 bases long.
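
The GC-content constraint on lengthened tags can be checked after each insertion. The following sketch covers only the 10%-40% window stated above; it does not model the stated variance from the average, and the function names and defaults are illustrative assumptions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def insert_base(seq: str, position: int, base: str) -> str:
    """Return seq with one base inserted before the given 0-based position."""
    return seq[:position] + base + seq[position:]


def gc_within_window(seq: str, low: float = 0.10, high: float = 0.40) -> bool:
    """True if the GC fraction lies in the inclusive window [low, high]."""
    return low <= gc_fraction(seq) <= high
```

For instance, inserting a single A into a 24mer with six Gs lowers the GC fraction from 0.25 to 0.24, which remains inside the window.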

Given a particular family of sequences that can be used as a family of 
tags (or tag complements), e.g., those of Table II, a skilled person will 
readily recognize variant families that work equally well.

Again taking the sequences of Table II for example, every T could be 
converted to an A and vice versa and no significant change in the cross- 
hybridization properties would be expected to be observed. This would also 
be true if every G were converted to a C. 

Also, all of the sequences of a family could be taken to be constructed
in the 5'-3' direction, as is the convention, or all of the constructions of
sequences could be in the opposite direction (3'-5').
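
The family-wide variants described above (exchanging every A with T, converting every G to C, or reading every sequence in the opposite direction) are simple string transformations applied uniformly to all members. A minimal sketch follows; the function names are illustrative assumptions.

```python
_SWAP_A_T = str.maketrans("AT", "TA")
_G_TO_C = str.maketrans("G", "C")


def swap_a_and_t(family):
    """Convert every T to A and every A to T in all sequences."""
    return [seq.translate(_SWAP_A_T) for seq in family]


def g_to_c(family):
    """Convert every G to C in all sequences."""
    return [seq.translate(_G_TO_C) for seq in family]


def reverse_all(family):
    """Write every sequence in the opposite direction."""
    return [seq[::-1] for seq in family]
```

Each transformation is applied to the whole family at once, which is what keeps the pairwise relationships between members unchanged.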

There are additional modifications that can be carried out. For 
example, C has not been used in the family of sequences. Substitution of C 
in place of one or more G's of a particular sequence would yield a sequence 
that is at least as low in homology with every other sequence of the family 
as was the particular sequence chosen for modification. It is thus possible 
to substitute C in place of one or more G's in any of the sequences shown in
Table II. Analogously, substituting C in place of one or more A's is
possible, or substituting C in place of one or more T's is possible.

It is preferred that the sequences of a given family are of the same, 
or roughly the same length. Preferably, all the sequences of a family of 
sequences of this invention have a length that is within five bases of the 
base-length of the average of the family. More preferably, all sequences are 
within four bases of the average base-length. Even more preferably, all or 
almost all sequences are within three bases of the average base-length of the 
family. Better still, all or almost all sequences have a length that is 
within two of the base-length of the average of the family, and even better
still, within one of the base-length of the average of the family. 

It is also possible for a person skilled in the art to derive sets of 
sequences from the family of sequences described in this specification and 
remove sequences that would be expected to have undesirable hybridization 
properties.

EXAMPLE 3 - Cross Talk Behavior of Sequence on Beads

A group of 100 of the sequences of Table I was tested for feasibility for 
use as a family of minimally cross-hybridizing oligonucleotides. The 100 
sequences selected are separately indicated in Table I along with the numbers 
assigned to the sequences in the tests.

The tests were conducted using the Luminex LabMAP™ platform available from 
Luminex Corporation, Austin, Texas, U.S.A. The one hundred sequences, used as
probes, were synthesized as oligonucleotides by Integrated DNA Technologies
(IDT, Coralville, Iowa, U.S.A.). Each probe included a C6 aminolink group
coupled to the 5'-end of the oligonucleotide through a C12 ethylene glycol
spacer. The C6 aminolink molecule is a six carbon spacer containing an amine
group that can be used for attaching the oligonucleotide to a solid support.
One hundred oligonucleotide targets (probe complements), the sequence of each
being the reverse complement of the 100 probe sequences, were also synthesized
by IDT. Each target was labelled at its 5'-end with biotin. All
oligonucleotides were purified using standard desalting procedures, and were
reconstituted to a concentration of approximately 200 uM in sterile, distilled
water for use. Oligonucleotide concentrations were determined
spectrophotometrically using extinction coefficients provided by the supplier.

Each probe was coupled by its amino linking group to a carboxylated
fluorescent microsphere of the LabMAP system according to the Luminex 100
protocol. The microsphere, or bead, for each probe sequence has unique,
or spectrally distinct, light absorption characteristics which permit
each probe to be distinguished from the other probes. Stock bead pellets
were dispersed by sonication and then vortexing. For each bead
population, approximately five million microspheres (400 uL) were removed
from the stock tube using barrier tips and added to a 1.5 mL Eppendorf
tube (USA Scientific). The microspheres were then centrifuged, the
supernatant was removed, and beads were resuspended in 25 uL of 0.2 M MES
(2-(N-morpholino)ethane sulfonic acid) (Sigma), pH 4.5, followed by
vortexing and sonication. One nmol of each probe (in a 25 uL volume) was
added to its corresponding bead population. A volume of 2.5 uL of EDC
cross-linker (1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride)
(Pierce), prepared immediately before use by adding 1.0 mL of sterile ddH2O
to 10 mg of EDC powder, was added to each microsphere population. Bead
mixes were then incubated for 30 minutes at room temperature in the dark
with periodic vortexing. A second 2.5 uL aliquot of freshly prepared EDC
solution was then added followed by an additional 30 minute incubation in
the dark. Following the second EDC incubation, 1.0 mL of 0.02% Tween-20
(BioShop) was added to each bead mix and vortexed. The microspheres were
centrifuged, the supernatant was removed, and the beads were resuspended
in 1.0 mL of 0.1% sodium dodecyl sulfate (Sigma). The beads were
centrifuged again and the supernatant removed. The coupled beads were
resuspended in 100 uL of 0.1 M MES pH 4.5. Bead concentrations were then
determined by diluting each preparation 100-fold in ddH2O and enumerating
using a Neubauer BrightLine Hemacytometer. Coupled beads were stored as
individual populations at 2-8 °C protected from light.

The relative oligonucleotide probe density on each bead population
was assessed by Terminal Deoxynucleotidyl Transferase (TdT) end-labelling
with biotin-ddUTPs. TdT was used to label the 3'-ends of single-stranded
DNA with a labeled ddNTP. Briefly, 180 uL of the pool of 100 bead
populations (equivalent to about 4000 of each bead type) to be used for
hybridizations was pipetted into an Eppendorf tube and centrifuged. The
supernatant was removed, and the beads were washed in 1x TdT buffer. The
beads were then incubated with a labelling reaction mixture, which
consisted of 5x TdT buffer, 25 mM CoCl2, and 1000 pmol of biotin-16-ddUTP
(all reagents were purchased from Roche). The total reaction volume was
brought up to 85.5 uL with sterile, distilled H2O, and the samples were
incubated in the dark for 1 hour at 37 °C. A second aliquot of enzyme was
added, followed by a second 1 hour incubation. Samples were run in
duplicate, as was the negative control, which contained all components
except the TdT. In order to remove unincorporated biotin-ddUTP, the beads
were washed 3 times with 200 uL of hybridization buffer, and the beads were
resuspended in 50 uL of hybridization buffer following the final wash. The
biotin label was detected spectrophotometrically using SA-PE
(streptavidin-phycoerythrin conjugate). The streptavidin binds to biotin
and the phycoerythrin is spectrally distinct from the probe beads. The
10 mg/mL stock of SA-PE was diluted 100-fold in hybridization buffer, and
15 uL of the diluted SA-PE was added directly to each reaction and
incubated for 15 minutes at 37 °C. The reactions were analyzed on
the Luminex 100 LabMAP. Acquisition parameters were set to measure 100
events per bead using a sample volume of 50 uL.

The results obtained are shown in Figure 2. As can be seen, the Mean
Fluorescent Intensity (MFI) of the beads varies from 277.75 to 2291.08, a
range of 8.25-fold. Assuming that the labelling reactions are complete
for all of the oligonucleotides, this illustrates the signal intensity
that would be obtained for each type of bead at this concentration if the
target (i.e., labelled complement) was bound to the probe sequence to the
full extent possible.

The cross-hybridization of targets to probes was evaluated as
follows. 100 oligonucleotide probes linked to 100 different bead
populations, as described above, were combined to generate a master bead
mix, enabling multiplexed reactions to be carried out. The pool of
microsphere-immobilized probes was then hybridized individually with each
biotinylated target. Thus, each target was examined individually for its
specific hybridization with its complementary bead-immobilized sequence,
as well as for its non-specific hybridization with the other 99
bead-immobilized universal sequences present in the reaction. For each
hybridization reaction, 25 uL bead mix (containing about 2500 of each bead
population in hybridization buffer) was added to each well of a 96-well
Thermowell PCR plate and equilibrated at 37 °C. Each target was diluted to
a final concentration of 0.002 fmol/uL in hybridization buffer, and 25 uL
(50 fmol) was added to each well, giving a final reaction volume of 50 uL.
Hybridization buffer consisted of 0.2 M NaCl, 0.1 M Tris, 0.08% Triton
X-100, pH 8.0 and hybridizations were performed at 37 °C for 30 minutes.
Each target was analyzed in triplicate and six background samples (i.e. no
target) were included in each plate. A SA-PE conjugate was used as a
reporter, as described above. The 10 mg/mL stock of SA-PE was diluted
100-fold in hybridization buffer, and 15 uL of the diluted SA-PE was added
directly to each reaction, without removal of unbound target, and
incubated for 15 minutes at 37 °C. Finally, an additional 35 uL of
hybridization buffer was added to each well, resulting in a final volume
of 100 uL per well prior to analysis on the Luminex 100 LabMAP. Acquisition
parameters were set to measure 100 events per bead using a sample volume
of 80 uL.

The percent hybridization was calculated for any event in which the
NET MFI was at least 3 times the zero target background. In other words,
a calculation was made for any sample where
(MFI_sample - MFI_zero target background) / MFI_zero target background ≥ 3.

A "positive" cross-talk event (i.e., significant mismatch or cross- 
hybridization) was defined as any event in which the net median fluorescent
intensity (MFI_sample - MFI_zero target background) generated by a mismatched
hybrid was greater than or equal to the arbitrarily set limit of 10% that of
the perfectly matched hybrid determined under identical conditions. As there
are 100 probes and 100 targets, there are 100 x 100 = 10,000 different
interactions possible, of which 100 are the result of perfect hybridizations.
The remaining 9900 result from hybridization of a target with a mismatched probe.
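
The scoring rules above can be expressed compactly. The sketch below is illustrative only: it assumes the net MFI values have already been arranged in a square table in which row t holds the signals of target t on every probe and the diagonal entry is the perfectly matched hybrid. The function names and the table layout are assumptions, not part of the procedure described.

```python
def interaction_counts(n: int):
    """Return (total, perfect-match, mismatched) interactions for n probes and n targets."""
    total = n * n
    return total, n, total - n


def is_quantifiable(sample_mfi: float, background_mfi: float, fold: float = 3.0) -> bool:
    """True if the net MFI is at least `fold` times the zero target background."""
    if background_mfi <= 0:
        raise ValueError("background MFI must be positive")
    return (sample_mfi - background_mfi) / background_mfi >= fold


def cross_talk_events(net_mfi, limit_percent: float = 10.0):
    """List (target, probe, percent of perfect match) for mismatched hybrids
    whose net MFI reaches limit_percent of the perfectly matched hybrid."""
    events = []
    for target, row in enumerate(net_mfi):
        perfect = row[target]
        for probe, value in enumerate(row):
            if probe != target and value * 100 >= limit_percent * perfect:
                events.append((target, probe, 100 * value / perfect))
    return events
```

For 100 probes and 100 targets, `interaction_counts(100)` gives 10,000 interactions in total, 100 perfect matches and 9900 mismatched pairings.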

The results obtained are illustrated in Figure 3. The ability of
each target to be specifically recognized by its matching probe is shown.
Of the possible 9900 non-specific hybridization events that could have
occurred when the 100 targets were each exposed to the pool of 100 probes,
6 events were observed. Of these 6 events, the highest non-specific event
generated a signal equivalent to 10.2% of the signal observed for the
perfectly matched pair (i.e. specific hybridization event).

Each of the 100 targets was thus examined individually for specific
hybridization with its complement sequence as incorporated onto a
microsphere, as well as for non-specific hybridization with the
complements of the other 99 target sequences. Representative
hybridization results for target 16 (complement of probe 16, Table I) are
shown in Figure 4. Probe 16 was found to hybridize only to its
perfectly-matched target. No cross-hybridization with any of the other 99
targets was observed.

The foregoing results demonstrate the possibility of incorporating
the 210 sequences of Table I, or any subset thereof, into a multiplexed
system with the expectation that most if not all sequences can be
distinguished from the others by hybridization. That is, it is possible
to distinguish each target from the other targets by hybridization of the
target with its precise complement and minimal hybridization with
complements of the other targets.

Methods For Synthesis Of Oligonucleotide Families

Preferably oligonucleotide sequences of the invention are synthesized
directly by standard phosphoramidite synthesis approaches and the like
(Caruthers et al, Methods in Enzymology; 154, 287-313: 1987; Lipshutz et al,
Nature Genet.; 21, 20-24: 1999; Fodor et al, Science; 251, 763-773: 1991).
Alternative chemistries involving non-natural bases such as peptide nucleic
acids or modified nucleosides that offer advantages in duplex stability may
also be used (Hacia et al, Nucleic Acids Res.; 27, 4034-4039: 1999; Nguyen et
al, Nucleic Acids Res.; 27, 1492-1498: 1999; Weiler et al, Nucleic Acids Res.;
25, 2792-2799: 1997). It is also possible to synthesize the oligonucleotide
sequences of this invention with alternate nucleotide backbones such as
phosphorothioate or phosphoramidate nucleotides. Methods involving
synthesis through the addition of blocks of sequence in a stepwise manner may
also be employed (Lyttle et al, Biotechniques; 19, 274-280: 1995). Synthesis
may be carried out directly on the substrate to be used as a solid phase
support for the application or the oligonucleotide can be cleaved from the
support for use in solution or coupling to a second support.

Solid Phase Supports 

There are several different solid phase supports that can be used with 
the invention. They include but are not limited to slides, plates, chips, 
membranes, beads, microparticles and the like. The solid phase supports can 
also vary in the materials that they are composed of including plastic, 
glass, silicon, nylon, polystyrene, silica gel, latex and the like. The 
surface of the support is coated with the complementary tag sequences by any 
conventional means of attachment. 

In preferred embodiments, the family of tag complement sequences is 
derivatized to allow binding to a solid support. Many methods of 
derivatizing a nucleic acid for binding to a solid support are known in the 
art (Hermanson G., Bioconjugate Techniques; Acad. Press: 1996). The sequence
tag may be bound to a solid support through covalent or non-covalent bonds
(Iannone et al, Cytometry; 39: 131-140, 2000; Matson et al, Anal. Biochem.;
224: 110-116, 1995; Proudnikov et al, Anal Biochem; 259: 34-41, 1998;
Zammatteo et al, Analytical Biochemistry; 280:143-150, 2000). The sequence 
tag can be conveniently derivatized for binding to a solid support by 
incorporating modified nucleic acids in the terminal 5' or 3' locations.

A variety of moieties useful for binding to a solid support (e.g.,
biotin, antibodies, and the like), and methods for attaching them to nucleic
acids, are known in the art. For example, an amine-modified nucleic acid
base (available from, e.g., Glen Research) may be attached to a solid support
(for example, Covalink-NH, a polystyrene surface grafted with secondary amino
groups, available from Nunc) through a bifunctional crosslinker (e.g.,
bis(sulfosuccinimidyl) suberate, available from Pierce). Additional spacing
moieties can be added to reduce steric hindrance between the capture moiety
and the surface of the solid support.

Attaching Tags to Analytes for Sorting 

A family of oligonucleotide tag sequences can be conjugated to a
population of analytes, most preferably polynucleotide sequences, in several
different ways including but not limited to direct chemical synthesis,
chemical coupling, ligation, amplification, and the like. Sequence tags that
have been synthesized with primer sequences can be used for enzymatic
extension of the primer on the target, for example in PCR amplification.

Detection of Single Nucleotide Polymorphisms Using Primer Extension 

There are a number of areas of genetic analysis where families of
non-cross-hybridizing sequences can be applied including disease diagnosis,
single nucleotide polymorphism analysis, genotyping, expression analysis and
the like. One such approach for genetic analysis, referred to as the primer
extension method (also known as Genetic Bit Analysis (Nikiforov et al,
Nucleic Acids Res.; 22, 4167-4175: 1994; Head et al, Nucleic Acids Res.; 25,
5065-5071: 1997)), is an extremely accurate method for identification of the
nucleotide located at a specific polymorphic site within genomic DNA. In
standard primer extension reactions, a portion of genomic DNA containing a
defined polymorphic site is amplified by PCR using primers that flank the
polymorphic site. In order to identify which nucleotide is present at the
polymorphic site, a third primer is synthesized such that the polymorphic
position is located immediately 3' to the primer. A primer extension
reaction is set up containing the amplified DNA, the primer for extension, up
to 4 dideoxynucleoside triphosphates (each labeled with a different
fluorescent dye) and a DNA polymerase such as the Klenow fragment of DNA
Polymerase I. The use of dideoxy nucleotides ensures that a single base is
added to the 3' end of the primer, a site corresponding to the polymorphic
site. In this way the identity of the nucleotide present at a specific
polymorphic site can be determined by the identity of the fluorescent
dye-labeled nucleotide that is incorporated in each reaction. One major
drawback to this approach is its low throughput. Each primer extension
reaction is carried out independently in a separate tube.
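
The logic of single-base primer extension can be illustrated with a toy model. The sketch assumes the amplified region is written 5' to 3' in the same sense as the extension primer, so that the base added by the polymerase is simply the base of that strand lying immediately 3' of the primer. The function name and the example sequences are hypothetical and serve only to show the principle.

```python
def incorporated_base(strand: str, primer: str) -> str:
    """Return the single base added to the 3' end of the primer.

    `strand` is the amplified region in the same sense as the primer;
    the dideoxynucleotide terminates extension after this one base.
    """
    start = strand.find(primer)
    if start < 0:
        raise ValueError("primer does not match the strand")
    site = start + len(primer)
    if site >= len(strand):
        raise ValueError("no base lies 3' of the primer")
    return strand[site]
```

Two alleles that differ only at the position immediately 3' of the primer therefore yield different incorporated bases, and hence different dye labels.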

Universal sequences can be used to enhance the throughput of primer
extension assays as follows. A region of genomic DNA containing multiple
polymorphic sites is amplified by PCR. Alternatively, several genomic
regions containing one or more polymorphic sites each are amplified together
in a multiplexed PCR reaction. The primer extension reaction is carried out
as described above except that the primers used are chimeric, each containing
a unique universal tag at the 5' end and the sequence for extension at the 3'
end. In this way, each gene-specific sequence would be associated with a
specific universal sequence. The chimeric primers would be hybridized to the
amplified DNA and primer extension is carried out as described above. This
would result in a mixed pool of extended primers, each with a specific
fluorescent dye characteristic of the incorporated nucleotide. Following the
primer extension reaction, the mixed extension reactions are hybridized to an
array containing probes that are reverse complements of the universal
sequences on the primers. This would segregate the products of a number of
primer extension reactions into discrete spots. The fluorescent dye present
at each spot would then identify the nucleotide incorporated at each specific
location. A number of additional methods for the detection of single
nucleotide polymorphisms, including but not limited to allele specific
polymerase chain reaction (ASPCR), allele specific primer extension (ASPE)
and oligonucleotide ligation assay (OLA), can be performed by someone skilled
in the art in combination with the tag sequences described herein.
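
The bookkeeping behind the tagged assay can be sketched as follows: each chimeric primer carries a tag at its 5' end, each array probe is the reverse complement of one tag, and a spot is decoded by mapping its probe back to a tag (and hence a locus) and its dye to a base. The function names, the locus label and the dye label below are hypothetical placeholders.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def chimeric_primer(tag: str, extension_primer: str) -> str:
    """Universal tag at the 5' end, sequence for extension at the 3' end."""
    return tag + extension_primer


def capture_probe(tag: str) -> str:
    """Array probe that hybridizes to the tag portion of a chimeric primer."""
    return reverse_complement(tag)


def decode_spot(probe: str, dye: str, tag_to_locus: dict, dye_to_base: dict):
    """Map an array spot (its probe and observed dye) to (locus, base)."""
    tag = reverse_complement(probe)
    return tag_to_locus[tag], dye_to_base[dye]
```

Because every tag belongs to a minimally cross-hybridizing family, each extended primer is expected to be captured only at the spot carrying its own tag complement.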

Kits Using Families Of Tag Sequences 

The families of non-cross-hybridizing sequences may be provided in kits
for use in, for example, genetic analysis. Such kits include at least one set
of non-cross-hybridizing sequences in solution or on a solid support.
Preferably the sequences are attached to microparticles and are provided with
buffers and reagents that are appropriate for the application. Reagents may
include enzymes, nucleotides, fluorescent labels and the like that would be
required for specific applications. Instructions for correct use of the kit
for a given application will be provided.

EXAMPLES

EXAMPLE 4 - Cross Talk Behavior of Sequence on Beads 

A group of 100 sequences, randomly selected from Table II, was
tested for feasibility for use as a family of minimally cross-hybridizing
oligonucleotides. The 100 sequences selected are separately indicated in
Table II along with the numbers assigned to the sequences in the tests.

The tests were conducted using the Luminex LabMAP™ platform
available from Luminex Corporation, Austin, Texas, U.S.A. The one hundred
sequences, used as probes, were synthesized as oligonucleotides by
Integrated DNA Technologies (IDT, Coralville, Iowa, U.S.A.). Each probe
included a C6 aminolink group coupled to the 5'-end of the oligonucleotide
through a C12 ethylene glycol spacer. The C6 aminolink molecule is a six
carbon spacer containing an amine group that can be used for attaching the
oligonucleotide to a solid support. One hundred oligonucleotide targets
(probe complements), the sequence of each being the reverse complement of
the 100 probe sequences, were also synthesized by IDT. Each target was
labelled at its 5'-end with biotin. All oligonucleotides were purified
using standard desalting procedures, and were reconstituted to a
concentration of approximately 200 uM in sterile, distilled water for use.
Oligonucleotide concentrations were determined spectrophotometrically
using extinction coefficients provided by the supplier.

Each probe was coupled by its amino linking group to a carboxylated
fluorescent microsphere of the LabMAP system according to the Luminex 100 protocol.
The microsphere, or bead, for each probe sequence has unique, or spectrally
distinct, light absorption characteristics which permit each probe to be
distinguished from the other probes. Stock bead pellets were dispersed by
sonication and then vortexing. For each bead population, five million microspheres
(400 uL) were removed from the stock tube using barrier tips and added to a 1.5 mL
Eppendorf tube (USA Scientific). The microspheres were then centrifuged, the
supernatant was removed, and beads were resuspended in 25 uL of 0.2 M MES
(2-(N-morpholino)ethane sulfonic acid) (Sigma), pH 4.5, followed by vortexing and
sonication. One nmol of each probe (in a 25 uL volume) was added to its corresponding
bead population. A volume of 2.5 uL of EDC cross-linker
(1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride) (Pierce), prepared
immediately before use by adding 1.0 mL of sterile ddH2O to 10 mg of EDC powder,
was added to each microsphere population. Bead mixes were then incubated for 30
minutes at room temperature in the dark with periodic vortexing. A second 2.5 uL
aliquot of freshly prepared EDC solution was then added followed by an additional
30 minute incubation in the dark. Following the second EDC incubation, 1.0 mL of
0.02% Tween-20 (BioShop) was added to each bead mix and vortexed. The microspheres
were centrifuged, the supernatant was removed, and the beads were resuspended in
1.0 mL of 0.1% sodium dodecyl sulfate (Sigma). The beads were centrifuged again and
the supernatant removed. The coupled beads were resuspended in 100 uL of 0.1 M MES
pH 4.5. Bead concentrations were then determined by diluting each preparation
100-fold in ddH2O and enumerating using a Neubauer BrightLine Hemacytometer. Coupled
beads were stored as individual populations at 8 °C protected from light.

The relative oligonucleotide probe density on each bead population was
assessed by Terminal Deoxynucleotidyl Transferase (TdT) end-labelling with
biotin-ddUTPs. TdT was used to label the 3'-ends of single-stranded DNA with a
labeled ddNTP. Briefly, 180 uL of the pool of 100 bead populations (equivalent to
about 4000 of each bead type) to be used for hybridizations was pipetted into an
Eppendorf tube and centrifuged. The supernatant was removed, and the beads were
washed in 1x TdT buffer. The beads were then incubated with a labelling reaction
mixture, which consisted of 5x TdT buffer, 25 mM CoCl2, and 1000 pmol of
biotin-16-ddUTP (all reagents were purchased from Roche). The total reaction volume
was brought up to 85.5 uL with sterile, distilled H2O, and the samples were
incubated in the dark for 1 hour at 37 °C. A second aliquot of enzyme was added,
followed by a second 1 hour incubation. Samples were run in duplicate, as was the
negative control, which contained all components except the TdT. In order to remove
unincorporated biotin-ddUTP, the beads were washed 3 times with 200 uL of
hybridization buffer, and the beads were resuspended in 50 uL of hybridization
buffer following the final wash. The biotin label was detected
spectrophotometrically using SA-PE (streptavidin-phycoerythrin conjugate). The
streptavidin binds to biotin and the phycoerythrin is spectrally distinct from the
probe beads. The 10 mg/mL stock of SA-PE was diluted 100-fold in hybridization
buffer, and 15 uL of the diluted SA-PE was added directly to each reaction and
incubated for 15 minutes at 37 °C. The reactions were analyzed on the Luminex 100
LabMAP. Acquisition parameters were set to measure 100 events per bead using a
sample volume of 50 uL.

The results obtained are shown in Figure 6. As can be seen, the Mean
Fluorescent Intensity (MFI) of the beads varies from 840.3 to 3834.9, a range of
4.56-fold. Assuming that the labelling reactions are complete for all of the
oligonucleotides, this illustrates the signal intensity that would be
obtained for each type of bead at this concentration if the target (i.e.,
labelled complement) was bound to the probe sequence to the full extent
possible.

The cross-hybridization of targets to probes was evaluated as follows.
100 oligonucleotide probes linked to 100 different bead populations, as
described above, were combined to generate a master bead mix, enabling
multiplexed reactions to be carried out. The pool of microsphere-immobilized
probes was then hybridized individually with each biotinylated target. Thus,
each target was examined individually for its specific hybridization with its
complementary bead-immobilized sequence, as well as for its non-specific
hybridization with the other 99 bead-immobilized universal sequences present
in the reaction. For each hybridization reaction, 25 µL bead mix (containing
about 2500 of each bead population in hybridization buffer) was added to each
well of a 96-well Thermowell PCR plate and equilibrated at 37 °C. Each target
was diluted to a final concentration of 0.002 fmol/µL in hybridization buffer,
and 25 µL (50 fmol) was added to each well, giving a final reaction volume of
50 µL. Hybridization buffer consisted of 0.2 M NaCl, 0.1 M Tris, 0.08% Triton
X-100, pH 8.0, and hybridizations were performed at 37 °C for 30 minutes. Each
target was analyzed in triplicate and six background samples (i.e. no target)
were included in each plate. A SA-PE conjugate was used as a reporter, as
described above. The 10 mg/mL stock of SA-PE was diluted 100-fold in
hybridization buffer, and 15 µL of the diluted SA-PE was added directly to
each reaction, without removal of unbound target, and incubated for 15
minutes at 37 °C. Finally, an additional 35 µL of hybridization buffer was
added to each well, resulting in a final volume of 100 µL per well prior to
analysis on the Luminex 100 LabMAP. Acquisition parameters were set to measure
100 events per bead using a sample volume of 80 µL.

The percent hybridization was calculated for any event in which the net
MFI was at least 3 times the zero target background. In other words, a
calculation was made for any sample where

(MFI_sample - MFI_zero target background) / MFI_zero target background ≥ 3.

The net median fluorescent intensity (MFI_sample - MFI_zero target background)
generated for all of the 10,000 possible target/probe combinations was
calculated. As there are 100 probes and 100 targets, there are 100 x 100 =
10,000 possible different interactions, of which 100 are the result
of perfect hybridizations. The remaining 9900 result from hybridization of a
target with a mismatched probe. A cross-hybridization event is then defined
as a non-specific event whose net median fluorescent intensity exceeds 3
times the zero target background. In other words, a cross-talk calculation is
only made for any sample where

(MFI_sample - MFI_zero target background) / MFI_zero target background > 3.

Cross-hybridization events were quantified by expressing the value of the
cross-hybridization signal as a percentage of the perfect match hybridization
signal with the same probe.
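The threshold test and percentage calculation described above can be sketched in Python. This is only an illustration under stated assumptions: the function name, the list-of-lists layout (target t is assumed to be the perfect match of probe t) and the example values are hypothetical, and the perfect-match signal is assumed to be background-subtracted in the same way as the cross-hybridization signal.

```python
def cross_hybridization_events(mfi, background, threshold=3.0):
    """Return (target, probe, percent) for each non-specific event above threshold.

    mfi[t][p]     : raw MFI of target t on probe p (assumption: target t matches probe t)
    background[p] : zero-target background MFI of probe p
    """
    events = []
    n = len(mfi)
    for t in range(n):
        for p in range(n):
            if t == p:
                continue  # perfect-match pair, not a cross-hybridization
            net = mfi[t][p] - background[p]
            # Only score events whose net signal exceeds 3x the background.
            if net / background[p] > threshold:
                perfect_net = mfi[p][p] - background[p]
                events.append((t, p, 100.0 * net / perfect_net))
    return events


# Hypothetical 2 x 2 example; the values are illustrative, not measured data.
mfi = [[1000.0, 50.0], [10.0, 2000.0]]
background = [10.0, 10.0]
events = cross_hybridization_events(mfi, background)
```

With 100 targets and 100 probes the same loop would visit the 9900 mismatched combinations discussed above.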

The results obtained are illustrated in Figure 7. The ability of each target
to be specifically recognized by its matching probe is shown. Of the possible 9900
non-specific hybridization events that could have occurred when the 100 targets were
each exposed to the pool of 100 probes, 6 events were observed. Of these 6 events,
the highest non-specific event generated a signal equivalent to 5.3% of the signal
observed for the perfectly matched pair (i.e. specific hybridization event).

Each of the 100 targets was thus examined individually for specific
hybridization with its complement sequence as incorporated onto a microsphere, as
well as for non-specific hybridization with the complements of the other 99 target
sequences. Representative hybridization results for target (complement of probe 90,
Table II) are shown in Figure 8. Probe 90 was found to hybridize only to its
perfectly-matched target. No cross-hybridization with any of the other 99 targets
was observed.

The foregoing results demonstrate the possibility of incorporating the 1168
sequences of Table II, or any subset thereof, into a multiplexed system with the
expectation that most if not all sequences can be distinguished from the others by
hybridization. That is, it is possible to distinguish each target from the other
targets by hybridization of the target with its precise complement and minimal
hybridization with complements of the other targets.

EXAMPLE 5 - Tag sequences used in sorting polynucleotides 

The family of non cross hybridizing sequence tags or a subset thereof can be
attached to oligonucleotide probe sequences during synthesis and used to generate
amplified probe sequences. In order to test the feasibility of PCR amplification
with non cross hybridizing sequence tags and subsequently addressing each respective
sequence to its appropriate location on two-dimensional or bead arrays, the following
experiment was devised. A 24mer tag sequence can be connected in a 5'-3' specific
manner to a p53 exon specific sequence (20mer reverse primer). The connecting p53
sequence represents the inverse complement of the nucleotide gene sequence. To
facilitate the subsequent generation of single stranded DNA post-amplification, the
tag-Reverse primer can be synthesized with a phosphate modification (PO4) on the
5'-end. A second PCR primer can also be generated for each desired exon,
represented by the Forward (5'-3') amplification primer. In this instance the
Forward primer can be labeled with a 5'-biotin modification to allow detection
with Cy3-avidin or equivalent.

A practical example of the aforementioned description is as follows:
For exon 1 of the human p53 tumor suppressor gene sequence the following
tag-Reverse primer (SEQ ID NO: 1171) can be generated:

                                222087            222063
5'-PO4-ATGTTAAAGTAAGTGTTGAAATGT-TCCAGGGAAGCGTGTCACCGTCGT-3'
       Tag Sequence #3          Exon 1 Reverse

The numbering above the Exon-1 reverse primer represents the genomic nucleotide
positions of the indicated bases.

The corresponding Exon-1 Forward primer sequence (SEQ ID NO: 1172) is as follows:

          221873            221896
5'-Biotin-TCATGGCGACTGTCCAGCTTTGTG-3'

In combination, these primers will amplify a product of 214 bp plus a 24 
bp tag extension yielding a total size of 238 bp. 
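The primer construction and the size arithmetic above can be checked with a short Python sketch. The sequences and the 214 bp amplicon size are taken from the example; the function and variable names are hypothetical and introduced only for illustration.

```python
TAG = "ATGTTAAAGTAAGTGTTGAAATGT"            # 24mer tag from the example above
EXON1_REVERSE = "TCCAGGGAAGCGTGTCACCGTCGT"  # exon 1 reverse primer from the example
AMPLICON_BP = 214                           # untagged product size stated in the text


def tag_reverse_primer(tag, reverse_primer):
    """Join the tag 5' of the exon-specific reverse primer (written 5'-3')."""
    return tag + reverse_primer


def tagged_product_size(amplicon_bp, tag):
    """Size of the PCR product once the tag extension is included."""
    return amplicon_bp + len(tag)


primer = tag_reverse_primer(TAG, EXON1_REVERSE)
total_bp = tagged_product_size(AMPLICON_BP, TAG)
```

Here `total_bp` evaluates to 238, the total size given in the text.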

Once amplified, the PCR product can be purified using a QIAquick PCR
purification kit and the resulting DNA can be quantified. To generate single
stranded DNA, the DNA is subjected to λ-exonuclease digestion, thereby resulting
in the exposure of a single stranded sequence (anti-tag) complementary to the
tag-sequence covalently attached to the solid phase array. The resulting
product is heated to 95 °C for 5 minutes and then directly applied to the array at
a concentration of 10-50 nM. Following hybridization and concurrent sorting,
the tag-Exon 1 sequences are visualized using Cy3-streptavidin. In addition to
direct visualization of the biotinylated product, the product itself can now act
as a substrate for further analysis of the amplified region, such as SNP
detection and haplotype determination.

The Invader Assay is described in detail in US Patent Nos. 5,846,717 and
5,985,557. Briefly, the ability of the Invader technology to identify target
nucleic acid sequences, and in particular single base pair changes, is dependent
on the proper structure being formed, followed by subsequent recognition and
cleavage of this structure by the Cleavase enzyme. For recognition by
Cleavase III, the target sequence must be complementary to the primary probe,
and there must be at least a 1 base "invasion" (overlap) of this structure by
an upstream oligonucleotide. Cleavable "flaps" can be created by invasion of an
upstream oligonucleotide without primer extension, and the site of cleavage is
determined by the extent to which the upstream oligonucleotide overlaps the
5' region of the downstream oligonucleotide. Cleavage by the Cleavase enzyme is
dependent on this invaded structure and is sensitive to single-base mismatches
positioned immediately upstream of the cleavage site. By adding overlapping
pairs of oligonucleotide probes complementary to a predetermined region of
target DNA, the cleavage of the downstream probes becomes a sensitive indicator
of the presence of the target sequence. Further, reaction conditions have been
established that allow multiple copies of the downstream oligonucleotide probe
to be cleaved for each target sequence without temperature cycling, so as to
amplify the cleavage signal and allow quantitative detection of target DNA at
sub-attomole levels. Incorporation of the minimally cross-hybridizing sequences
of the invention described herein into the probe that will be cleaved by the
Cleavase enzyme allows detection of multiple target DNA sequences in a single
experiment.

DEFINITIONS 

Non-cross-hybridization: Describes the absence of hybridization between
two sequences that are not perfect complements of each other.

Cross-hybridization: The hydrogen bonding of a single-stranded DNA
sequence that is partially but not entirely complementary to a
single-stranded substrate.

Homology or Similarity: How closely related two or more separate strands 
of DNA are to each other, based on their base sequences. 

Analogue: The symbols A, G, T/U, C take on their usual meaning in the art
here. In the case of T and U, a person skilled in the art would
understand that these are equivalent to each other with respect to the
inter-strand hydrogen-bond (Watson-Crick) binding properties at work in
the context of this invention. The two bases are thus interchangeable,
hence the designation T/U. A chemical which resembles a nucleotide
base is an analogue thereof: a base that does not normally appear in DNA
but can substitute for the ones which do, despite minor differences in
structure. Analogues of the naturally occurring bases that are
particularly useful in this invention can be inserted in their respective
places where desired. Such an analogue is any non-natural base, such as
peptide nucleic acids and the like, that undergoes normal Watson-Crick
pairing in the same way as the naturally occurring nucleotide base to
which it corresponds.

Complement: The opposite or "mirror" image of a DNA sequence. A
complementary DNA sequence has an "A" for every "T" and a "C" for every
"G". Two complementary strands of single stranded DNA, for example a tag
sequence and its complement, will join to form a double-stranded molecule.
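The base-pairing rule in this definition can be written as a small Python sketch. The helper names are hypothetical; reading the complement in the reverse direction (so that both strands are written 5'-3') follows the usual antiparallel convention and is an assumption not spelled out in the definition. The example tag is the 24mer used in Example 5.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Swap each base for its Watson-Crick partner, position by position."""
    return "".join(PAIRS[base] for base in seq)


def reverse_complement(seq):
    """Complement read in the opposite direction (assumed antiparallel pairing)."""
    return complement(seq)[::-1]


tag = "ATGTTAAAGTAAGTGTTGAAATGT"
anti_tag = reverse_complement(tag)
```

Applying `reverse_complement` to `anti_tag` returns the original tag, which is the sense in which a tag and its complement form a matched pair.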

Complementary DNA (cDNA): DNA that is synthesized from a messenger RNA
template; the single-stranded form is often used as a probe in physical
mapping.

Oligonucleotide: Refers to a short nucleotide polymer whereby the
nucleotides may be natural nucleotide bases or analogues thereof.

Tag: Refers to an oligonucleotide that can be used for specifically sorting
analytes together with at least one other oligonucleotide, where the
oligonucleotides, when used together, do not cross hybridize.

Similar Homology: In the context of this invention, pairs of sequences are
compared with each other based on the amount of "homology" between the
sequences. By way of example, two sequences are said to have a 50% "maximum
homology" with each other if, when the two sequences are aligned side-by-side
with each other so as to obtain the (absolute) maximum number of identically
paired bases, the number of identically paired bases is 50% of the total number
of bases in one of the sequences. (If the sequences being compared are of
different lengths, then it would be of the total number of bases in the shorter
of the two sequences.) Examples of determining maximum homology are as follows:

Example 1:

    *   *
A-A-B-B-C-C
    B-D-C-D-D-D   (2 out of 4 paired bases are the same)

      *   *
A-A-B-B-C-C
      B-D-C-D-D-D   (2 out of 3 paired bases are the same)

In this case, the maximum number of identically paired bases is two and
there are two possible alignments yielding this maximum number. The total
number of possible pairings is six, giving 33 1/3 % (2/6) homology. The maximum
amount of homology between the two sequences is thus 1/3.

Example 2:

* *     *
A-A-B-B-C-A
A-A-D-D-C-D   (3 out of 6 paired bases are the same)

In this alignment, the number of identically paired bases is three and the
total number of possibly paired bases is six, so the homology between the two
sequences is 3/6 (50%).

          *
A-A-B-B-C-A
          A-A-D-D-C-D   (1 out of 1 paired bases are the same)

In this alignment, the number of identically paired bases is 1, so the
homology between the two sequences is 1/6 (16 2/3 %).

The maximum homology between these two sequences is thus 50%.
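The two worked examples can be reproduced with a short Python sketch. It is a minimal illustration, assuming that "alignment" means sliding one sequence past the other without gaps, as in the examples; the function name is hypothetical and this is not presented as the procedure actually used to generate the tag families.

```python
def max_homology(a, b):
    """Largest count of identically paired positions over all ungapped
    offsets, divided by the length of the shorter sequence."""
    best = 0
    # shift is the position in `a` that lines up with the first base of `b`.
    for shift in range(-(len(b) - 1), len(a)):
        matches = sum(
            1
            for i, base in enumerate(b)
            if 0 <= i + shift < len(a) and a[i + shift] == base
        )
        best = max(best, matches)
    return best / min(len(a), len(b))


example_1 = max_homology("AABBCC", "BDCDDD")  # Example 1 above
example_2 = max_homology("AABBCA", "AADDCD")  # Example 2 above
```

For Example 1 the best offsets give two identical pairs out of six bases (1/3), and for Example 2 the unshifted alignment gives three out of six (50%), matching the values stated above.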

Block sequence: Refers to a symbolic representation of a sequence of blocks. In
its most general form a block sequence is a representative sequence in which no
particular value, mathematical variable, or other designation is assigned to
each block of the sequence.

Incidence Matrix: As used herein, this is a well-defined term in the field of
Discrete Mathematics. However, an incidence matrix cannot be defined without
first defining a "graph". In the method described herein a subset of general
graphs called simple graphs is used. Members of this subcategory are further
defined as follows.

A simple graph G is a pair (V, E) where V represents the set of vertices
of the simple graph and E is a set of un-oriented edges of the simple graph. An
edge is defined as a 2-component combination of members of the set of vertices.
In other words, in a simple graph G there are some pairs of vertices that are
connected by an edge. In our application a graph is based on nucleic acid
sequences generated using sequence templates: vertices represent DNA
sequences and edges represent a relative property of any pair of sequences.

The incidence matrix is a mathematical object that allows one to describe
any given graph. For the subset of simple graphs used herein, the simple graph
G = (V, E), and for a pre-selected and fixed ordering of vertices,
V = {v_1, v_2, ..., v_n}, elements of the incidence matrix A(G) = [a_ij] are
defined by the following rules:

(1) a_ij = 1 for any pair of vertices {v_i, v_j} that is a member of the set
of edges; and

(2) a_ij = 0 for any pair of vertices {v_i, v_j} that is not a member of the
set of edges.

This is an exact unequivocal definition of the incidence matrix. In effect, one
selects the indices 1, 2, ..., n of the vertices and then forms an (n x n) square
matrix with elements a_ij = 1 if the vertices v_i and v_j are connected by an edge
and a_ij = 0 if the vertices v_i and v_j are not connected by an edge.
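Rules (1) and (2) can be illustrated with a short Python sketch. The function name, the toy sequences and the `connected` test are all hypothetical; any pairwise rule (for example a homology threshold) could be supplied in its place.

```python
def incidence_matrix(vertices, connected):
    """Build the n x n matrix A(G): a_ij = 1 if vertices i and j share an
    edge, 0 otherwise. Diagonal entries are 0 (no vertex pairs with itself)."""
    n = len(vertices)
    return [
        [1 if i != j and connected(vertices[i], vertices[j]) else 0
         for j in range(n)]
        for i in range(n)
    ]


def few_identical_positions(a, b, limit=1):
    """Hypothetical edge rule: at most `limit` identical positions."""
    return sum(x == y for x, y in zip(a, b)) <= limit


sequences = ["AAAA", "AATT", "TTTT"]  # toy vertices, not real tags
matrix = incidence_matrix(sequences, few_identical_positions)
```

In this toy case only "AAAA" and "TTTT" are joined by an edge, so the matrix is symmetric with a single pair of off-diagonal ones.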

To define the term "class property" as used herein, the term
"complete simple graph" or "clique" must first be defined. The complete simple
graph is required because all sequences that result from the method described
herein should collectively share the relative property of any pair of sequences
defining an edge of graph G, for example not violating the threshold rule, that
is, not having a "maximum simple homology" greater than a predetermined amount,
whichever pair of the sequences is chosen from the final set. It is possible
that additional "local" rules, based on known or empirically determined behavior
of particular nucleotides, or nucleotide sequences, are applied to sequence
pairs in addition to the basic threshold rule.

In the language of a simple graph, G = (V, E), this means that in the final
graph there should be no pair of vertices (no sequence pair) not connected by an
edge (because an edge means that the sequences represented by v_i and v_j do not
violate the threshold rule).






Because the incidence matrix of any simple graph can be generated by
the above definition of its elements, the consequence of defining a simple
complete graph is that the corresponding incidence matrix for a simple complete
graph will have all off-diagonal elements equal to 1 and all diagonal elements
equal to 0. This is because if one aligns a sequence with itself, the threshold
rule is of course violated, and all other sequences are connected by an edge.

For any simple graph, there might be a complete subgraph. First,
the definition of a subgraph of a graph is as follows. The subgraph Gs = (Vs, Es)
of a simple graph G = (V, E) is a simple graph that contains a subset Vs of the
set V of vertices, the inclusion of the set Vs into the set V being an
immersion (a mathematical term). This means that one generates a subgraph
Gs = (Vs, Es) of a simple graph G in two steps. First select some vertices Vs
from G. Then select those edges Es from G that connect the chosen vertices and
do not select edges that connect selected with non-selected vertices.

We desire a subgraph of G that is a complete simple graph. By using
this property of the complete simple graph generated from the simple graph G of
all sequences generated by the template based algorithm, the pairwise property
of any pair of the sequences (violating/non-violating the threshold rule) is
converted into the property of all members of the set, termed "the class
property".

By selecting a subgraph of a simple graph G that is a complete
simple graph, this assures that, up to the tests involving the local rules
described herein, there are no pairs of sequences in the resulting set that
violate the threshold rule, also described above, independent of which pair of
sequences in the set are chosen. This feature is called the "desired class
property".
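Selecting a complete subgraph can be sketched in Python as follows. This is a simple greedy heuristic chosen only for illustration; the text does not state which clique-selection procedure is used, the greedy result is not guaranteed to be the largest possible clique, and the function names and example matrix are hypothetical.

```python
def greedy_clique(matrix):
    """Scan vertices in order and keep each one that is connected by an edge
    to every vertex already kept. The result is always a clique."""
    chosen = []
    for v in range(len(matrix)):
        if all(matrix[v][u] == 1 for u in chosen):
            chosen.append(v)
    return chosen


def is_clique(matrix, members):
    """Check the class property: every pair of members shares an edge."""
    return all(matrix[i][j] == 1 for i in members for j in members if i != j)


# Hypothetical 4-vertex graph: vertex 3 is connected only to vertex 1.
example = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
selected = greedy_clique(example)
```

Here vertices 0, 1 and 2 are mutually connected and are kept, while vertex 3 is rejected because it has no edge to vertex 0.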

The present invention thus includes reducing the potential for non
cross-hybridization behavior by taking into account local homologies of
the sequences and appears to have greater rigor than known approaches.
For example, the method described herein involves the sliding of one
sequence relative to the other sequence in order to form a sequence
alignment that would accommodate insertions or deletions (Kane et al.,
Nucleic Acids Res.; 28, 4552-4557: 2000).






Table I

SEQ ID NO:   Sequence                   No. Assigned in Example 3

1            GATTTGTATTGATTGAGATTAAAG   1
2            TGATTGTAGTATGTATTGATAAAG   2
3            GATTGTAAGATTTGATAAAGTGTA   3
4            GATTTGAAGATTATTGGTAATGTA   4
5            GATTGATTATTGTGATTTGAATTG   5
6            GATTTGATTGTAAAAGATTGTTGA   6
7            ATTGGTAAATTGGTAAATGAATTG   7
8            ATTGGATTTGATAAAGGTAAATGA
9            GTAAGTAATGAATGTAAAAGGATT   3
10           GATTGATTGATTGATTGATTTGAT
11           TGATGATTAAAGAAAGTGATTGAT
12           AAAGGATTTGATTGATAAAGTGAT
13           TGTAGATTTGTATGTATGTATGAT   10
14           GATTTGATAAAGAAAGGATTGATT
15           GATTAAAGTGATTGATGATTTGTA   11
16           AAAGAAAGAAAGAAAGAAAGTGTA   12
17           TGTAAAAGGATTGATTTGTATGTA
18           AAAGTGTAGATTGATTAAAGAAAG
19           AAAGTTGATTGATTGAAAAGGTAT
20           TTGATTGAGATTGATTTTGAGTAT




2 1 


TGAATTGATGAATGAATGAAGTAT 

x vjrUT. x x x vjriui x \Jc^n. x v_j.r^fAv_i x n x 


X D 


22 


GTAATGAAGTATGTATGTAAGTAA 

vj x rifTi x unnu x -n. x vj x n x vj x n-rivj x nri 




23 


TGATGATTTGAATGAAGATTOATT 

X VJJfTL X uA XXX X "JiUiVJrt X X V-J-fi X X 


X o 


24 


TGATAAAGTGATAAAGGATTAAACl 

luninnnulUnlfUinOunl lndfirtv 


1 7 

x / 


2 5 


THATTTGAGTATTTnAGATTTTftA 

x vj^t. xxx vj.t^vj x xxx vjnvjn x x x x vjr^i 


1 ft 
X o 


26           TGTAGTAAGATTGATTAAAGGTAA
27           GTATAAAGGATTGATTTTGAAAAG
28           GTATTTGAGTAAGTAATTGATTGA   19
29           GTAAAAAGTTGAGTATTGAAAAAG
30           GATTTGATAAAGGATTTGTATTGA
31           GATTGTATTGAAGTATTGTAAAAG   20
32           TGATGATTTTGATGAAAAAGTTGA
33           TGATTTGAGATTAAAGAAAGGATT   21
34           TGATTGAATTGAGTAAAAAGGATT   22
35           AAAGTGTAAAAGGATTTGATGTAT
36           AAAGGTATTTGAGATTTGATTGAA
37           AAAGTTGAGATTTGAATGATTGAA   23
38           TGTATTGAAAAGGTATGATTTGAA
39           GTATTGTATTGAAAAGGTAATTGA   24
40           TTGAGTAATGATAAAGTGAAGATT
41           TGAAGATTTGAAGTAATTGAAAAG   25
42           TGAAAAAGTGTAGATTTTGAGTAA   26
43           TGTATGAATGAAGATTTGATTGTA
44           AAAGTTGAGTATTGATTTGAAAAG   27
45           GATTTGTAGATTTGTATTGAGATT
46           AAAGAAAGGATTTGTAGTAAGATT   29
47           GTAAAAAGAAAGGTATAAAGGTAA   30
48           GATTAAAGTTGATTGAAAAGTGAA   31
49           TGAAAAAGGTAATTGATGTATGAA
50           AAAGGATTAAAGTGAAGTAATTGA   33
51           ATGAATTGGTATGTATATGAATGA   34
52           TGAAATGAATGAATGATGAAATTG   35


j -j 


ATTGATTGTGAATGAAATGAATTG 

x X vJ^i x x vj x uiui x vjrvfirTv x vjnn x x vj 


3 6 


54 


ATTGAAAGATGAAAAGATGAAAAG 


37 


55 


ATTGTTGAAAAGTGTAATGATTGA 

xx X X vj X X wrxXTTlriVJ X vj X xxxx. X vjri X X w*v 


38 


56 


ATGATGTAATGAAAAGATTGTGTA 

xi x vjr* x \j x nn x wi^riririvjri x j, vj jl vj x .fx 


3 9 


-j 1 


AAAGATTGAAAGATGATGTAATTG 

riTuiVjii x x vjrTrx/T.\jr^. x x vj x iui x x vj 




c. o 
j o 


ATTGATGAGTATATTGTCTAGTAA 

.rt x J. vjrt. x vjrt.vj x a x -rt. x x vj -W.vj x rtvj x .rtrt. 


*± X 


-J 


AAAGATTGTGTAATTGATGATGAA 

rtrtrt.vjrt, X X vj x vj x ii-rt. X X vjrt X VJri X vjrt^rt 






AAAGGTATATTGTGTAATGAGTAA 

rtrtrt. vjvj X rt X rt X X vj X vj x rtjr\ X Oriu X J^rt 




O X 


TGTAATGAGTATTGTAATTGAAAG 

X VJ X rtjrt X \Jr\VJ X rt. X X VJ X r\Tl X X VJrtJTJ^VJ 


43 


£9 


GTATAAAGAAAGATTGGTAAATGA 

vj X xt. X rtrtrt.vj rtrtrt. vjrti X X vjvj X rtrtrt X vjrt 




D J 


TTGAGTAATTGAATTGTGAAATGA 

X X Vjrtvj X rtrt. X X vjrtrt. X X VJ X VJ rtrtrt. X VJrt 


4 ^ 


£4 


T(~ZT A TTG A A TG A A TTGTTG A TflT A 
lOlnl X vjrtrt. X Vjrtrt. X X Vj X X Vjrt. 1U In. 


4 <^ 


o o 


THIT A L\TTf^GTA A ATGAGTA A A A ACl 
lOl .rtrt. X X vjvj X rtrtrt. X vj.rt.vj X rtrtrtrtrtVj 




O D 


TGAATGAAATTGATCiAGTATAA AG 

X Vjrtrt X VjrtrtJA X X VJrt, X vjrtvj X rt. X rtrt.rt.vj 






(TTAAGTAA ATTGA AAGATTGATGA 

vj X rtrtvj X rtrtrt X J. vjrt-rtrt.vj.rt. X X vjrt X vjrt 


4 Q 


ft 
D O 


GTAAATGATGATATTGGTATATTG 

vj X rtrtrt. X vjrt. X VJrt. X rt. X X Vjvj X rt. X rt. X X vj 


J VJ 




ATTnTTHATn ATTflATTf^ A A ATHIA 
rt X X vj X X vjrt. X Vjrt 1 X Vjrt X X kj rtrtrt. X vjrt 


D X 


/ u 


ATT^THA AnT ATA A AGATfiATTHA 
nl lvjl vjrtrt.vj X rt. ± rt-rtrtvjrt. X vjrt. X X Vjrt. 




/ X 


ZiTnA A A AHTTnAriTA A ATTGTfi AT 
rt X vjrtrtrtrt.vj X X Vjrtvj X rtrtrt. X X Vj X VJrt. X 




/ /a 


a Trc i\ w TTr^t a l\ i\ citci a tt^ a a a a a ci 

rt. X vjrtrt. X X VjrtrtrtVj X vjrt X X wrt_rtrtrtrtiAJ 


j 


n ~x 


HTA A ATTnATfiA A A AHTTnATflAT 
VJ X rtrtrt. X X vjrt X Vjrtrtrtrtvj X X vjrt X Vjrt. X 




1 A 


rtrtrt. vj 1 vjrt 1 vj 1 rt. 1 rt. 1 vjrtvj 1 rtrtrt. 1 lb 


o o 


/ D 


HT A Z\ TH IV T A A A H, A Tfi A TH A T A TTf^ 
Vj X -Mrt. X Vjrt X rtrtrt VJrt. X Vjrt. X vjrt X rt X X VJ 


o / 


"7 £ 
/ D 


TTr* A A A A.riATTnflTA ATniVTZiTniV 
X 1 vjrtrtrtrtvjrt X X Vjvj X rtrt X vjrt 1 rt X Vjrt 




1 1 

1 f 


A A ACTH.A AAA AaATTClAT'Tn.ATn.A 
rtrtrt,Vj X vjrt>irtrtrtvjrt X X vjrt X X vjrt X vjrt 


c; q 


1 P 
/ 0 


rtX X vjrt X vjrt. vjrt 1 i vjrt. 1 X rt X X vj 1 vj X rt 




1 Q 
f J 


ATGAGATTATTGGATTTGTAGATT 

rt X VjrtVjrt. X X rt X X vjVjrt X X X Vj X rtVJrt X X 


o u 




TGA AGATTATGAATTGGTA AG ATT 

X Vjrtrt vjrt. X X rt X vjrtrt. X X vjvj X rtrt vjrt. X X 


D X 


R 1 


a ttcicz a tt a t^i ana TT A TCi A TTn A 

rt X X VjVjrt X X rt. X vjrtvjrt X X rt. X Vjrt. X X vjrt 


D <i 


R 9 


aTTHTTHA ATTGH ATT A A AG ATG A 

rt X X vj X X Vjrtrt. X XVjVjrtX X rtrtrt. vjrt X VJrt 




ft ^ 


A A AGATGAGTA AGTAA ATTGGATT 

rtrtrt vjrt. X vjrtvj X rtrtvj X rtrtrt. X X VjVjrt X X 




ft 4 


A A AGGTAAGATTATTGATGAAAAG 

rtrtrt vjvj X rtrt. vjrt. X X rt. X X vjrt. X Vjrtrtrtrt.vj 


O J 


ft R 


ATTGATGAGATTAAAGTTGAATTG 
rt x x vjrt. x vjrt. vjrt. x ± .rtrtrtvj x x vjrtrt. x x vj 




ft 

o o 


GATTATTGGATTATGAAAAGGATT 

VJrt X X-rtX XvJVJrtX X rt. X VJrtrtrtirtVJVJrt X X 




ft 7 
o / 


HATTTGTAATTGTTGAGTAAATGA 

vjrt X X X VJ X rtrt. X X vj X X vjrt.vj X rtrtrt X Vjrt 


fi7 

VJ / 


ft ft 


AAAGAAAGATTGTTGAGATTATGA 

rtrtrt. Vjrtrtrt VJrt X X Vj X X Vjrt. Vjrt. X X rt X VJrt 


Gft 


ft 9 


GTATAAAGGATTTTGAATTGATGA 

VJ X £± X rtrtrtVJVJrt X X X X VJrturt X X VJrt. X VJrt, 




^ u 


TTGAGATTGTAAATGAATTGTTGA 

X X vjrt.vj.rt. x x vj x rvLM x vjnn x x vj x x vjrt. 




91           GTATATTGATTGTGTAATGAAAAG
92           TGATATGAATTGGATTATTGGTAT   70
93           ATGAATGATGAATGATGATTATTG
94           ATGAATTGATTGGATTGTAATGAT   71
95           GATTGTAATTGAGTAAATTGATGA
96           GATTATTGGATTAAAGGTAAATGA   72


J7 / 


aTTnTTna aTTnaTnanaTTTnaT 

rt X X Vj X X vjrtrt. X X vjrt. X vjrtvjrt. XXX vjrt. X 


7 7 
/ J 


98           GATTATGAGTAAATTGATTGTGAT
99           GATTATTGTTGATGAATGATATTG
100          TGTAAAAGATTGAAAGGTATGATT   75
101          GTATTTAGATGAGTTTGTTAGATT   76
102          TGAAGTTATGTAATAGAAAGTGAT
103          GTATGTATTGTATGTAGTTAATTG   77
104          TGATATAGATAGTTAGATAGATAG   78
105          ATGATGATGTATTGTAGTTATGAA   79






106          TTAGTGAATGTATTAGTTGATGTA
107          GTTAGTTAGATTATTGTTAGTTAG   80
108          GTTAATTGTGTAGTTTGTTATTGA
109          GTTATGAAATAGTGATATTGTTAG
110          ATTGTTAGAAAGTGTAGATTAAAG   81
111          ATGAGTATGTTATTAGTGTATGTA   82
112          TGTAATAGTGAAGTTAGATTGTAT   83
113          ATTGATAGATGATTAGTTAGTTGA   84
114          ATGAGTTTGTTTATGAGATTAAAG
115          TGATGTTTGATTATGATGTAGTAT   85
116          ATGAGTTAGTTATGAATTAGATGA
117          ATTGTTAGTGATGTTAGTAATTAG   86
118          TGATGTAAGTATTGATGTTAGTTT   87
119          GATTGTAAATAGAAAGTGAAGTAA   88
120          ATTGTGTATGAAGTATTGTATGAT
121          ATAGTGATGTTATGAAGATTGTTA
122          TTAGATGAATTGTGAAGTATTTAG   90
123          GTAAGTTATGATTGATGTTATGAA   91
124          GTATTGATGTTTAAAGTGTAATAG   92
125          GATTGTAAGTAAGATTGTATATTG
126          GTTTGTATTTAGATGAATAGAAAG   93
127          GTTTGATTTGTAATAGTGATTGTA
128          TGTATGTAGTATTTAGAAAGATGA
129          ATGAATTGTGATAAAGAAAGTTAG
130          TTAGTGTAGTAAGTTTAAAGTGTA   95
131          GTATGATTGTTTGTAATTAGTGAT
132          GTTTAAAGTTAGTTGAGTTAGTAT   96
133          ATAGTGTATGTAGATTATGAGATT   97
134          TTGAATGATTAGTTGAGTATGATT   98
135          GTATGTAAGTTAGTATGATTTGAA
136          TGTAGTATATTGTTGAATTGTGAT
137          ATAGTGATTGTATGTATGATAAAG
138          TTAGTGATTGATGTATATTGAAAG
139          GTAAGATTATGAGTTATGATGTAA
140          GTTATGAAATTGTTAGTGTAGATT   99
141          GTTAGATTTGTAGTTTAAAGATAG   100
142          TTAGTGATTGAAATGATGTAGATT
143          AAAGTGTAGTTATTAGTTAGTTAG
144          AAAGAAAGTGTATGATGTTATTAG
145          GATTGTATATTGTGTATGATGATT
146          TTGAGATTGTTATGATATGAGTAT
147          ATGAGTATGATTGTTATGATGTTT
148          TGATTTAGTGAAATTGTGTATTAG
149          TGAATGTATGTAGTATGTTTGTTA
150          GTTAGTATTGATGATTATGAGTTA
151          GTATATTGTGATTTAGTTGAGATT
152          GTTAGTTTAAAGTTGAGATTGTTT
153          GTATATTGTTAGATGAGATTTGTA
154          TGATGTATGTTAGTTTATGAATGA
155          TGTAGTATGTAATGTAGTATTTGA
156          ATGAGTTATGTATTGAGTTAGTAT
157          TGTATGATGATTATAGTTGAGTAA
158          ATTGATGAATGAGTTTGTATAAAG
159          TTGAGTTTATGATTAGAAAGAAAG
160          TGATATTGATGAGTTAGTATTGAA






161          ATAGAAAGTGAAATGAGTATGTTA
162          TTGATGTAGATTTGATGTATATAG
163          TTGAGATTATAGTGTAGTTTATAG
164          TGATGTTAGATTGTTTGATTATTG
165          TGTATTAGATAGTGATTTGAATGA
166          GATTATGATGAATGTAGTATGTAA
167          TGAATGATTGATATGAATAGTGTA
168          GTAATGATTTAGTGTATTGAGTTT
169          TGTAGTAATGATTTGATGATAAAG
170          TGAAGATTGTTATTAGTGATATTG
171          GTATTTGAATGATGTAATAGTGTA
172          GTATATGATGTATTAGATTGAAAG
173          AAAGTTAGATTGAAAGTGATAAAG
174          GTAAGATGTTGATATAGAAGATTA   9
175          TAATATGAGATGAAAGTGAATTAG
176          TTAGTGAAGAAGTATAGTTTATTG   13
177          GTAGTTGAGAAGATAGTAATTAAT
178          ATGAGATGATATTTGAGAAGTAAT
179          GATGTGAAGAAGATGAATATATAT
180          AAAGTATAGTAAGATGTATAGTAG   14
181          GAAGTAATATGAGTAGTTGAATAT
182          TTGATAATGTTTGTTTGTTTGTAG   28
183          TGAAGAAGAAAGTATAATGATGAA
184          GTAGATTAGTTTGAAGTGAATAAT
185          TATAGTAGTGAAGATGATATATGA
186          TATAATGAGTTGTTAGATATGTTG
187          GTTGTGAAATTAGATGTGAAATAT
188          TAATGTTGTGAATAATGTAGAAAG
189          GTTTATAGTGAAATATGAAGATAG   42
190          ATTATGAAGTAAGTTAATGAGAAG   47
191          GATGAAAGTAATGTTTATTGTGAA
192          ATTATTGAGATGTGAAGTTTGTTT   48
193          TGTAGAAGATGAGATGTATAATTA   53
194          TAATTTGAGTTGTGTATATAGTAG
195          TGATATTAGTAAGAAGTTGAATAG
196          GTTAGTTATTGAGAAGTGTATATA   55
197          GTAGTAATGTTAATGAATTAGTAG   58
198          GTTTGTTTGATGTGATTGAATAAT
199          GTAAGTAGTAATTTGAATATGTAG   64
200          GTTTGAAGATATGTTTGAAGTATA
201          ATGATAATTGAAGATGTAATGTTG




202 


GTAGATAGTATAGTTGTAATGTTA 

VJ X liWil X iiVJ X X fiU X X V_* X flA X VJ X X 


o o 


203          GATGTGAATGTAATATGTTTATAG   69
204          TGAAATTAGTTTGTAAGATGTGTA   74
205          TGTAGTATAAAGTATATGAAGTAG   63
206          ATATGTTGTTGAGTTGATAGTATA   89
207          ATTATTGAGTAGAAAGATAGAAAG   94
208          GTTGTTGAATATTGAATATAGTTG
209          ATGAGAAGTTAGTAATGTAAATAG
210          TGAAATGAGAAGATTAATGAGTTT








Table II 

Sequence, SEQ ID NO: No. in 

Ex 4 



AAATTGTGAAAGATTGTTTGTGTA    1           1
GTTAGAGTTAATTGTATTT[?]ATGA  2           -
ATGTTAAAGTAAGTGTTGAAATGT    3           -
TGATGTTAGAAGTATATTGTGAAT    4           -
TTTGTGTAGAATATGTGTTGTTAA    5           -
ATAAGTGTAAGTGAAATAAGAAGA    6           -
AAGAGTATTTGTTGTGAGTTAAAT    7           -
GTGTTTATGTTATATGTGAAGTTT    8           -
AAAGAGAATAGAATATGTGTAAGT    9           -
TATGAAAGAGTGAGATAATGTTTA    10          -
ATGAGAAATATGTTAGAATGTGAT    11          -
TTAGTTGTTGATGTTTAGTAGTTT    12          -
GTAAAGAGTATAAGTTTGATGATA    13          -




AAAGTAAGAATGATGTAATAAGTG    14          -
GTAGAAATAGTTTATTGATGATTG    15          -
TGTAAGTGAAATAGTGAGTTATTT    16          2
AAATAGATGATATAAGTGAGAATG    17          -
ATAAGTTATAAGTGTTATGTGAGT    18          -
TATAGATAAAGAGATGATTTGTTG    19          -
AGAGTTGAGAATGTATAGTATTAT    20          -
AAGTAGTTTGTAAGAATGATTGTA    21          -
TTATGAAATTGAGTGAAGATTGAT    22          -
GTATATGTAAATTGTTATGTTGAG    23          -
GAATTGTATAAAGTATTAGATGTG    24          4
TAGATGAGATTAAGTGTTATTTGA    25          -
GTTAAGTTTGTTTATGTATAGAAG    26          -


GAGTATTAGTAAAGTGATATGATA    27          -
GTGAATGATTTAGTAAATGATTGA    28          -
GATTGAAGTTATAGAAATGATTAG    29          -
AGTGATAAATGTTAGTTGAATTGT    30          -
TATATAGTAAATGTTTGTGTGTTG    31          -
TTAAGTGTTAGTTATTTGTTGTAG    32          -
GTAGTAATATGAAGTGAGAATATA    33          -
TAGTGTATAGAATGTAGATTTAGT    34          -
TTGTAGATTAGATGTGTTTGTAAA    35          -
TAGTATAGAGTAGAGATGATATTT    36          -
ATTGTGAAAGAAAGAGAAGAAATT    37          7
TGTGAGAATTAAGA[?]TAAGAATGT  38          -
ATATTAGTTAAGAAAGAAGAGTTG    39          -




TTGTAGTTGAGAAATATGTAGTTT    40          -
TAGAGTTGTTAAAGAGTGTAAATA    41          -
GTTATGATGTGTATAAGTAATATG    42          -
TTTGTTAGAATGAGAAGATTTATG    43          10
AGTATAGTTTAAAGAAGTAGTAGA    44          -
GTGAGATATAGATTTAGAAAGTAA    45          -
TTGTTTATAGTGAAGTGAATAGTA    46          -
AAGTAAGTAGTAATAGTGTGTTAA    47          -
ATTTGTGAGTTATGAAAGATAAGA    48          -
GAAAGTAGAGAATAAAGATAAGAA    49          -
ATTTAAGATTGTTAAGAGTAGAAG    50          -
GTTTAAAGATTGTAAGAATGTGTA    51          -
TTTGTGAAGATGAAGTATTTGTAT    52          -








TGTGTTTAGAATTTAGTATGTGTA    53          -
GATAATGATTATAGAAAGTGTTTG    54          -
GTTATTTGTAAGTTAAGATAGTAG    55          -
AGTTTATTGAAAGAGTTTGAATAG    56          -
TTGTGTTTATTGTGTAGTTTAAAG    57          -
ATTGTGAGAAGATATGAAAGTTAT    58          -
TGAGAATGTAAAGAATGTTTATTG    59          13
ATGTGAAAGTTATGATGTTAATTG    60          -
GTTTAGTATTAGTTGTTAAGATTG    61          -
GATTGATATTTGAATGTTTGTTTG    62          14
TGAATTGAAAGTGTAATGTTGTAT    63          -
GATTGTATTGTTGAGAATAGAATA    64          -
AAATTTGAGATTTGTGATAGAGTA    65          -


GTAATTAGATTTGTTTGTTGTTGT    66          -
GTTTGTATTGTTAGTGAATATAGT    67          -
ATGTAGTAGTAGATGTTTATGAAT    68          -
TGTTTAAAGATGATTGAAGAAATG    69          -
TGTGATAATGATGTTATTTGTGTA    70          -
ATAGTTGTGAGAATTTGTAATTAG    71          -
ATAGATGTAAGAGAAATTGTGAAA    72          -
AGATTAAGAGAAGTTAATAGAGTA    73          -
GAAGTAAATTGTGAATGAAAGAAA    74          -
AATGTAAGAAAGAAGATTGTTGTA    75          -
TTTGATTTATGTGTTATGTTGAGT    76          -
GTATTGAGAAATTTGAAGAATGAA    77          -
GAATTGTATGAAATGAATTGTAAG    78          -




TATTGTAGAAGTAAAGTTAGAAGT    79          -
TTTATGTAATGATAAGTGTAGTTG    80          -
ATATAGTTGAAATTGTGATAGTGT    81          -
ATAAGAAATTAGAGAGTTGTAAAG    82          -
GAATTGTGAAATGTGATTGATATA    83          -
AAATAAGTAGTTTAATGAGAGAAG    84          -
GATTAAAGAAGTAAGTGAATGTTT    85          -
TATGTGTGTTGTTTAGTGTTATTA    86          -
GAGTTATATGTAGTTAGAGTTATA    87          -
GAAAGAAAGAAGTGTTAAGTTAAA    88          -
TAGTATTAGTAAGTATGTGATTGT    89          -
TTGTGTGATTGAATATTGTGAAAT    90          -
ATGTGAAAGAGTTAAGTGATTAAA    91          -


GATTGAATGATTGAGATATGTAAA    92          -
AAGATGATAGTTAAGTGTAAGTTA    93          17
TAGTTGTTATTGAGAATTTAGAAG    94          -
TTTATAGTGAATTATGAGTGAAAG    95          -
GATAGATTTAGAATGAATTAAGTG    96          18
TTTGAAGAAGAGATTTGAAATTGA    97          -
ATGAATAAGAGTTGATAAATGTGA    98          -




T 


G 


T 


T 


T 


A 


T 


G 


T 


A 


G 


T 


G 


T 


A 


G 


A 


T 


T 


G 


A 


A 


T 
T


365 


_ 


AGATATTTGATAGATAGAAGAAAG    366         -
ATTTAGAGTTGTAAGAAGATATTG    367         -
GAGAAATTGTAATTGTTAGAGTAT    368         -
GAAGTATATGTTAAGATGTAATAG    369         -
AATATTGAAGATGTAGTGAGTTAT    370         -
GAGTTTAGAAATGATAAAGAATTG    371         -
TAAGAAATGAGTTATATGTTGAGA    372         60
TTGATATAAGAAGTTGTGATAAGT    373         -
AAGTGTTTAATGTAAGAGAATGAA    374         61
GTTGTGAGAATTAGAAATAGTATA    375         -
TTTAGTTTGATGTGTTTATGAGAT    376         -








p 

w 


T 


A 
in. 


p^ 


X 


T 

X 


p 


A 


A 


A 


p 


X 


A 


rp 
X 


p 


A 


p 
<j 


T 


A 


p 


T 

X 


A 
in. 


A 
r\ 


T 

X 




•377 




x 






T 


T 

X 




p^ 




T 






G 




T 


T 


G 


A 


G 




G 


j± 


A 


A 
r\ 


T 

X 




-3 / O 




T 


T 






p 


X 


p 


& 




G 


T 

X 


a 


T 


T 


G 


T 

X 


T 

X 


T 




T 

X 


T 


a 
\j 


A 


A 
t\ 




TTQ 




A 

in 


T 

X 


T 




A 
in. 


X 


T 

X 


T 


Q 


T 


T 

X 


G 








T 

X 


A 




Q 


X 




X 


T 
X 


P 




ion 
J 0 u 




T 
1 


r* 
\j 


A 


A 
in 


T 
X 


T 

X 


p 

kJJ 


T 

X 


T 

X 


p 


A 
t\ 


T 

X 


A 




G 


T 

X 


X 


A 


T 
X 


P 


A 


A 
A 


P 
w 


A 
A 




-) 01 
J 0 X 




L 


rp 


X 


T 


Lj 


X 


X 


A 
in 


T 

X 


X 


P 




p 


T 

X 


A 

t\ 


A 


p 


T 
X 


T 
X 


P 


A 
M. 


A 
A 


T 
X 


T 
X 




TOT 

J O Z 




X 


Lj 


A 
A 


X 


1 


T 
1 


A 
in. 


P 


T 

X 


A 


T 
X 




T 

X 


A 


T 

X 


X 


A 
M. 


p 


A 
r\ 


P 


X 


T 
X 


p 
Li 


A 
A 




J O O 




1 


A 
in 


A 
in. 


A 
M. 


X 


A 
M. 


P 


A 


p 


A 
r\ 


T 

X 


p 


A 
r\ 


p 


A 


A 


X 


A 
r\ 


A 
M. 


P 


A 
M. 


A 
r\ 


A 
r\ 


p 
Lr 




^ fl A 

JOT 




A 
A 


Lx 


A 
A 


A 
A 


rp 

X 


p 
Lj 


T 
X 


X 


A 
A 


T 

X 


A 
A 


T 
X 


P 
Li 


X 


A 


p 
Lj 


A 
A 


p 
Lj 


A 
A 


A 

A 


A 
A 


T 
X 


X 


p 
Lr 




ope 




A 
A 


X 


T 
X 


T 
X 


A 
A 


rp 

X 


p 


X 


A 
r\ 


p 


T 
1 


X 


P 


A 
r\ 


P 


A 


p 
Li 


X 


P 
Lj 


A 

A 


T 


A 
A 


A 
A 


A 
A 




J OD . . 




rj 
Li 


HP 


A 
A 


A 
A 


A 
A 


p 

Lj 


A 
A 


X 


A 
A 


P 
Lj 


T 
X 


TP 
X 


T 
X 


p 


A 
A 


p 
Lj 


rp 

X 


A 

A 


a 
A 


rp 
X 


T 

X 


rp 

X • 


p 
Lj 


A 
A 




TOT 
JO/ 




Lr 


A 

A 


A 
A 


A 
A 


rp 

1 


A 
A 


p 
Lr 


rp 

X 


A 
A 


rp 

X 


A 
A 


A 
A 


HP 
X 


p 
Lr 


T 
X 


X 


A 
A 


A 
A 


p 
Lj 


rp 
X 


p 
Lr 


A 
A 


p 
Lj 


A 
A 




■JQQ 




A 
A 


HP 
1 


T 
1 


p 


T 
1 


A 
A 


rp 

X 


A 
A 


T 
X 


rp 
X 


p 
Lr 


T 
X 


p 


X 


X 


Lj 


A 
A 


A 
A 


p 
Lj 


A 
A 


A 
A 


A 
A 


p 
Lj 


rp 

X 




0 0 y 




p 
Li 


A 


p 
Lr 


T 
1 


rp 

.1 


A 
A 


A 
A 


p 


T 
i 


P 

Li 


HP 
X 


A 
A 


A 
A 


A 


X 


p 
Lj 


A 
A 


A 
A 


a 
A 


X 


p 

Lj 


1 


A 
A 


A 
A 








A 


X 


A 
A 


p 
Lr 


A 


X 


rp 


p 
Lj 


rp 
X 


p 
Li 


X 


p 
Lr 


A 
A 


A 

A 


A 
A 


p 
Lj 


A 
A 


A 
A 


A 
A 


p 
Lr 


A 
A 


A 

A 


rp 

X 


rp 

1 








T 
X 


np 

X 


A 
A 


A 
A 


rp 
1 


A 
A 


p 
Li 


A 
A 


A 
A 


p 
Li 


rp 
1 


rp 

X 


rp 

X 


p 
Lj 


rp 
X 


A 
A 


p 
Lj 


rp 

1 


A 
A 


rp 
X 


p 
L 


A 
A 


rp 
X 


P 
Lr 




j y ^ 




1 


HP 


p 


1 


A 


rp 
1 


p 
Lr 


T 
1 


p 
Li 


A 
A 


P 

Lr 


A 
A 


A 
A 


rp 
X 


A 
A 


A 
A 


A 
A 


p 
Lr 


rp 

X 


rp 
X 


rp 

X 


A 
A 


p 

Lj 


rp 
X 




n q n 




p 

Li 


1 


p 
Lj 


A 
A 


rp 

X 


rp 
1 


A 
A 


p 


A 
A 


rp 
X 


A 
A 


T 
1 


p 
Lj 


A 
A 


X 


P 
Lj 


A 
A 


H" 1 
i 


A 
A 


HP 
X 


p 
Lr 


A 

A 


A 

A 


rp 

X 




j y 4 




1 


Li 


A 


A 


Lj 


A 
A 


A 

A 


Lr 


A 
A 


A 
A 


rp 
X 


rp 

X 


HP 
X 


A 
A 


p 
Lj 


A 
A 


rp 

X 


rp 
X 


rp 

X 


p 

Lr 


rp 

X 


A 

A 


A 

A 


Lr 




"5 O C 




1 


P 
Li 


rp 

X 


A 

A 


rp 
1 


p 

Lj 


A 
A 


rp 

X 


rp 

X 


A 

A 


rp 
X 


rp . 


. p 
Lr 


A 
A 


rp 

X 


rp 

X 


A 
A 


p 

Lr 


rp 

X 


p 
Lr 


rp 
1 


P 
Lr 


HP 

X 


rp 

X 




:5 y 0 




rp 
1 


P 
Lj 


rp 
i 


Lr 


A 


A 

A 


A 

A 


Kj 


A 

A 


L» 


A 
A 


A 

A 


rp 

X 


Lr 


A 

A 


rp 
X 


A 
A 


Lr 


A 

A 


rp 

X 


A 

A 


rp 

• X 


rp 
X 


rp 
X 




i y 7 




A 


A 


X 


rp 
1 


G 


A 


A 

A 


A 

A 


rp 
X 


Lr 


A 

A 


Lr 


rp 

X 


Lj 


rp 

X 


Lr 


rp 
X 


rp 
1 


rp 


A 
A 


A 

A 


G 


A 

A 


A 

A 




n q q 

i y 0 - 




A 


rp 


rp 


A 


rp 
1 


A 

A 


Lr 


A 

A 


Li 


rp 
X 


rp 
X 


A 

A 


p 
Li 


rp 

X 


rp 
X 


rp 


A 

A 


L 


A 

A 


A 

A 


rp 

X 


L 


A 

A 


Li 




1 no 

j y y 




A 


A 


A 


p 
Lr 


A 

A 


rp 
1 


A 

A 


r\ 


A 

A 


A 

A 


A 

A 


rp 


X 


p 
Li 


A 
A 


p 
Lr 


rp 
X 


p 

Lr 


rp 
X 


A 

A 


rp 
X 


Lr 


A 

A 


rp 
X 




/I A A 

4 (J U 




P 
Lj 


rp 
1 


A 


Lr 


rp 


rp 
1 


rp 

X 


p 

Lf 


HP 
X 


rp 
X 


A 

A 


A 

A 


rp 
X 


Lr 


rp 
X 


rp 
X 


Li 


rp 
1 


A 

A 


rp 
X 


A 

A 


A 


rp 

X 


Lj 




4 01 




A 

A 


Lj 


A 


p 
L 


A 

A 


rp 


A 

A 


rp 
1 


rp 

X 


A 

A 


Li 


A 

A 


A 

A 


rp 
X 


p 

Li 


X 


A 

A 


A 

A 


"p 

Lr 


A 

A 


A 

A 


rp 
1 


A 

A 


Lr 




4 0 z 


64 


A 


p 


A 
A 


A 
A 


Li 


rp 
1 


rp 
X 


rp 
X 


p 
Li 


A 

A 


A 

A 


A 

A 


rp 

X 


A 
A 


rp 
X 


p 
Li 


A 

A 


rp 

X 


A 
A 


p 

Li 


A 

A 


A 

A 


rp 
X 


Lj 




4 U J 




T 


A 


P 
Lr 


A 
A 


A 
A 


rp 
X 


p 

Lr 


rp 


A 
A 


A 
A 


A 
A 


p 

Lr 


rp 
X 


rp 

X 


HP 
X 


A 
A 


p 
Lr 


rp 
X 


A 
A 


rp 
X 


A 
A 


P 
Lr 


A 
A 


P 
.Lj 




/i n /i 
4 U4 




A 


p 
Lr 


rp 

1 


A 
A 


Li 


A 

A 


rp 
X 


p 

L7 


rp 
1 


A 
A 


rp 
X 


p 

Lj 


HP 
X 


rp 
X 


A 
A 


A 
A 


HP 
X 


P 
Lr 


rp 

X 


p 
Lj 


A 
A 


A 

A 


rp 

X 


A 

A 




4 U D 




1 


p 
Li 


A 
A 


A 
A 


A 
A 


P 

Li 


T 

X 


p 
Li 


A 
A 


A 

A 


A 
A 


HP 
X 


' A 

A 


rp 
X 


p 
Lr 


A 
A 


A 
A 


A 
A 


rp 
X 


p 
Lj 


rp 
1 


rp 
1 


p 
Lj 


rp 
X 




4 U 0 




A 
A 


rp 
1 


A 
A 


n 


rp 


A 
A 


rp 

X 


A 
A 


rp 
X 


HP 
X 


p 
Lr 


A 
A 


p 
Lj 


X 


T 
X 


HP 
X 


p 
Lr 


HP 


A 
A 


rp 

X 


p 
Lr 


A 
A 


A 
A 


P 
Lr 




4 U / 




p 
L 


A 
A 


A 
in 


p 
L 


A 
A 


A 
A 


A 
A 


1 


p 

Li 


X 


rp 

X 


T 
X 


p 
Li 


X 


A 

M. 


c 

VJ 


A 
A 


A 
A 


X 


A 
A 


A 
A 


p 
Li 


rp 

X 


A 

A 




A n Q 
4 U 0 




n 
A 


A 
A 


TP 
1 


p 
Lj 


A 
A 


p 
Lr 


X 


A 
A 


X 


rp 

X 


p 
Lr 


A 
A 


A 
A 


p 


A 
A 


A 
A 


A 
A 


X 


p 
Lj 


X 


A 
A 


X 


A 
A 


p 
Lr 




4 u y 




p 


X 


P 


A 
in 


HP 


A 
A 


p 
Lj 


A 
M. 


A 


T 
X 


X 


T 
X 


P 
vj 


T 

X 


P 


X 


X 


T 
X 


A 


A 
r\ 


X 


p 

Li 


A 

A 


A 
A 




*i X U 


D O 


T 
1 


p 
Li 


T 
i 


A 
A 


p 

Li 


1 


A 
A 


rp 
X 


p 
Lj 


A 

A 


A 
A 


p 
Lj 


A 
A 


A 
A 


X 


A 
A 


A 
A 


rp 
X 


p 
Lr 


A 
A 


A 

A 


A 
A 


HP 
X 


p 
Li 




ATT 
4 X X 






T 

X 


A 

in 


p 


A 
in 


A 
in 


P 


T 

X 


T 

X 


A 
r\ 


A 


T 

X 


P 


A 


X 


A 


A 
r\ 


T 

X 


HP 
X 


P 


T 

X 


p 
Lj 


X 


p 
Lr 




A 1 9 
ft X Z 




P 
L 


T 
X 


P 
w 


A 
in 


X 


T 

. X 


p 


T 
X 


A 


A 
A 


p 


rp . 
X 


A 


A 


p 

vjr 


X 


A 


A 


A 


p 


A 
M. 


T 
1 


A 
A 


A 
A 




A 1 7 
4 X O 




T 


A 
in 


T 


P 


T 

X 


A 
in 


p 

\J 


T 

X 


T 

X 


T 


P 


T 

X 


G 


T 


T 


A 


T 

X 


T 

X 


T 


p 


A 


A 


P 


A 
M. 




ATA 
1 Xt 




T 


G 




Q 


T 

X 


A 
in 


A 
in 


p 
\j 


T 


X 


T 

X 


G 


T 


A 


T 


P 


T 


T 


T 


A 




P 


T 

X 


A 
c\ 




A T R - 
** X D 


D / 


T 








T 

X 


P 
Vj 


T 

X 


j\ 


T 


p 

w 


A 


p 

VJ 


T 


G 


T 




T 


A 


^ 




P 


A 
r\ 


A 
in. 


A 
r\ 




/ 1 C 
1IO 




p 


T 

X 


A 




r* 

W 


A 
M. 


\j 


T 

X 


A 


T 

X 


T 

X 


p 

vjr 


A 




A 


T 

X 


T 

X 


A 


p 


T 


A 


A 
r\ 




A 
r\ 




A 1 7 


D O 


Q 


T 


T 




A 
rV 


P 


T 

X 


G 


T 


A 






G 




T 


T 




T 


T 


G 




T 
X 




A 
in. 




1x0 








X 


A 
in 


T 

X 


p 


A 
in 


p 


T 
X 


T 

X 


A 
r\ 


T 

X 


T 

X 


A 
c\ 


p 

v_J 


A 


X 


A 

M, 


A 
r-\. 


A 
/A 


p 


T 

X 


P 
w 


A 
in 




A T Q 




A 


HP 
1 


1 


T 
X 


p 
Lr 


rp 
i 


rp 

X 


A 
A 


T 
X 


A 
A 


Li 


A 
A 


p 
Lj 


X 


X 


p 
Lj 


np 
X 


P 
Lr 


T 
X 


T 

X 


p 
Lj 


rp 
1 


A 
A 


rp. 

X 




a h> n 

4 Z U 




TAATTAGTAGTGTGTTGAAATTTG    421         -
TGTATTGAGATTGTTATTGTATTG    422         -
GTTATTAGAAGAGATAATTGAGTT    423         -
TTGAGTTGTGATTAAGTAGTATAT    424         -
GATAGTATAATGATTGAAGTAATG    425         -
GTGAAAGATATTTGAGAGATAAAT    426         -
AGTTATGATTTGAAGAAATTGTTG    427         -
GTAAGTATTTGAATTTGATGAGTT    428         -
TAATAGTGTTATAAGTGAAAGAGT    429         -
AAATGAATTGATGTGTATATGAAG    430         -








AGAAAGTGAGTTGTTAAGTATTTA    431         -
TT[?]ATGTGTGAATTGTGTATATAG  432         -
GTAATATGATAGAAATGTAAA[?]A[?]  433       -
[?]AGAATTGTTTAAAGATAGTTGTA  434         -
[?]AATTTGTTAAGAATGAGTTTGAT  435         -
ATAGTGATGATTAAAGAGAATTT[?]  436         -
ATAGATGTTTAGTTGAGATTATT[?]  437         -




A 


A 
A


X 


rp 

X 


Lj 


rp 
1 


rp 


a 
A 


Lj 


A 

A 


Lr 


A 
A 


505 




Lj 


A 
A 


p 
Lj 


1 


p 


A 


A 


rp 
X 


A 
A 


Lj 


A 

A 


A 

A 


A 
A 


P 
Lj 


A 
A 


rp 
X 


A 

A 


T 

X 


p 
Lj 


X 


rp 

X 


A 

A 


A 

A 


rp 


C A ^ 

b U 0 . 




A 


A 
• A 


A 
A 


lj 


A 


p 


A 
A 


rp 
X 


A 
A 


T 
1 


rp 
X 


p 
Lj 


A 
A 


A 
A 


p 

L7 


A 
A 


p 
Lj 


A 
A 


A 
A 


T 
X 


A 
A 


A 

A 


A 

A 


Lj 


C A *7 




p 
Lj 


T 

1 


T 
I 


A 
A 


1 


A 


p 
Lr 


A 


A 
A 


rp 
1 


A 

A 


a 
A 


p 
Lj 


T 
X 


T 
X 


p 

Lj 


X 


A 
A 


A 

A 


A 
A 


p 
Lj 


rp 


Lj 


rp 


C A Q 




T 
I 


Lj 


A 
A 


1 


A 


Lr 


X 


A 


X 


Lr 


A 

A 


T 
X 


A 
A 


A 
A 


X 


p 
Lj 


rp 
L 


p 
Lj 


rp 
X 


T 
X 


rp 

X 


A 

A 


rp 


Lj 


tZ A Q 

bo y 




1 


1 


1 


p 
Lj 


rp 
X 


rp 
1 


p 
Lr 


rp 
X 


T 
X 


A 
A 


A 
A 


p 
Lr 


T 
X 


A 
A 


rp 

X 


p 
Lj 


rp 

X 


p 
Lj 


A 
A 


T 
X 


rp 

X 


rp 
X 


A 

A 


P 
Lj 


C T A 

b 1 U 


/ / 


1 


A 

A 


A 
A 


A 
A 


Lj 


rp 


p 
Lj 


X 


X 


p 


rp 

X 


p 
Lj 


rp 

X 


X 


A 
A 


A 
A 


A 
A 


p 
Lr 


A 

A 


rp 
X 


T 
X 


A 

A 


A 

A 


Lj 


b 1 ± 




T 
1 


Lj 


T 
X 


p 
Li 


1 


T 
i 


T 
X 


p 
Lj 


A 
r\ 


T 
1 


X 


p 
Lj 


A 
A 


X 


T 
X 


A 
A 


A 
A 


X 


p 
Lj 


rp 
X 


X 


A 
A 


T 
X 


p 
Lj 


b Iz 




A 

A 


T 
1 


1 


A 
A 


A 

A 


T 
1 


p 
Lj 


A 
A 


A 
Pi. 


T 


p 
Lj 


a 
A 


p 
Lj 


T 
X 


p 
Lj 


T 
X 


rp 

X 


p 
Lj 


T 
X 


A 
A 


A 
A 


rp 

X 


P 
Lj 


X 


b 1 J 




T 
1 


A 

A 


p 
Lj 


A 
A 


rp 
X 


p 
Lr 


rp 


rp 

X 


T 
X 


n 


X 


p 
Lj 


A 

A 


p 
Lr 


rp 
X 


X 


rp 
X 


p 
Lj 


A 

A 


rp' 
X 


A 
A 


rp 

X 


X 


A 
A 


b 14 




p 
Lj 


A 
A 


A 
A 


T 


p 
Lj 


A 

A 


A 

A 


X 


A 


p 
Lj 


rp 
X 


a 
A 


A 

A 


X 


A 
A 


p 
Lj 


A 
A 


X 


p 


A 
A 


X 


rp 

X 


X 


p 
Lj 


CI c 

bib 




A 
A 


A 
A 


T 
1 


A 
A 


p 
Lj 


1 


p 


X 


p 


T 
X 


X 


p 


T 

X 


T 
X 


A 
A 


T 
X 


A 
A 


X 


p 
Lj 


A 
A 


T 
X 


rp 
X 


A 
A 


p 
Li 


bio 




T 


A 
A 


Lr 


A 
A 


rp 
X 


T 
1 


A 


p 


A 


A 
A 


p 


A 
A 


X 


p 


T 
X 


T 
X 


p 
Lj 


X 


p 
Lj 


X 


A 
A 


rp 
X 


rp 

X 


A 
A 


c: 1 n 
b 1 / 




A 
i-v 


A 
A 


T 


LJ 


X 


p 
Lj 


X 


p 


T 
X 


P 


X 


rp 
X 


A 
A 


A 
A 


A 


X 


p 
Lj 


A 
A 


A 
A 


T 
X 


X 


T 
X 


p 
Lj 


X 


b x 0 




p 


r\ 




T 


T 

X 


a' 

M. 


A 


p 


T 

X 


A 
r\ 


X 


A 
M. 


X 


P 




p 

VJ 




p 

vJ 


X 


A 
r\ 


p 


A 
A 


A 
A 


A 
A 


, ji j 




T 
i. 


T 


fa 


T 


T 

X 


P 


T 

X 


P 


T 


G 


T 

X 


A 


A 


P 

w 


T 

X 


A 


P 
LJ 


T 

X 


p 


T 


fa 


A 
M. 


A 
r\ 


T 

X 


3ZU 




Q 


T 


fa 


G 


T 

X 


A 


A 


A 




A 


p 


fa 


fa 


T 


T 


p 


X 


T 

X 


rp 
X 


fa 


P 
\J 


X 


A 


T 

X 


OZ X 


ft n 

0 u 


fa 




G 


T 


T 

X 


T 

X 


P 


T 

X 


A 


A 


p 

VJJ 


fa 


fa 


G 


T 


fa 


p 


T 

X 


T 


ri 


A 


A 


T 
X 


A 
M. 


c; 0 0 




A 


Q 


T 


T 




T 


A 


G 


T 


A 
in. 


T 

X 


fa 


G 


T 




G 


T 


A 


T 


fa 


G 


A 


P 
VJ 


A 


J« J 




G 


A 








fa 




^ 


T 


G 


T 


G 


T 




T 


A 


p- 

vJ 


T 


T 


T- 


fa 


A 


T 

X 


\J 


c 9 4 
O Z ft 




T 


T 


G 


T 


G 


fa 


G 


T 


A 




T 


G 






T 


G 




T 


G 


T 


fa 


T 
X 


T 

X 


A 
ri 


R ? R 
jz j 




G 


T 


A 


G 


A 


G 


T 


T 


G 


T 


A 


fa 


A 


T 


A 


G 


A^ 


G 


A. 


A 


T 


A 


A 


fa 


52 6 




A 


T 


T 


A^ 


fa 


T 


G 


T 


j\ 


G 




T 


T 


G 


T 


A 


A 


G 


fa 


G 


A_ 


T 


fa 


G 






T 


T 


A 
A 


p 
lj 


1 


p 


1 


p 




T 
X 


X 


p 


X 


A 
A 


p 

Lj 


A 


T 
X 


A 
A 


p 

Lr 


A 
A 


A 
A 


rp 

X 


X 


A 
A 


CTO 

b z 0 




AGAGAGTTTGTGTATATGTATAAA    529         81
TTAAGTTTAGTGAGATTTGTTAAG    530         -
ATGAAGTTTATTGAATAGTAGTGA    531         -
ATATTTGTGTTGTATGTTTGTGAA    532         -
AAAGTGTTTATAGAAGATTTGATG    533         -
AAGAGATATGATTTGTTAGTTGTA    534         -
AAGAAGAAATGAGTGATAATGTAA    535         -
TAGTGTTTGATATGTTAAGAAGTT    536         -
GTAGAAAGTGATAGATTAGTAATA    537         -
GATAAATGTTAAGTTAGTATGATG    538         -








A 


G 


A 


T 


T 


A 


G 


A 


A 


G 


A 


A 


T 


rp 
1 


G 


rp 
1 


1 


1 


7\ 


G 


A 

A 


A 

A 


rp 
1 


b 






A 


T 


A 


T 


T 


T 


G 


A 


G 


A 


A 


G 


T 


G 


T 


b 


A 

A 


A 
A 


A 


rp 

I 


b . 


A 

A 


A 


T 


C A (\ 




T 


G 


A 


G 


T 


A 


A 


A 


T 


A 


G 


rp 


rp 

1 


1 


A 


rp 


b 


A 

A 


b 


rp 


A 

A 


b 


rp 
1 


A 

A 


o4 1 




T 


T 


A 


G 


A 


G 


A 


G 


T 


A 


G 


A 


T 


A 


A 


A 
A 


b 


A 

A 


rp 


rp 

T 


rp 


b 


A 
A 


T 


b4^J 




A 


T 


T 


G 


T 


T 


T, 


A 


A 


G 


T 


T 


G 


T 


T 


G 


A 


T 


A 


A 


b 


A 

A 


rp 

T 


G 


54 3 




GTTGTAAAGTTAAAGTGTGAATTT    544         -
ATAGATTGTGTGTTTGTTATAGTA    545         -
GTAAGTTATTGAGAATGATAATAG    546         -
TAGATTAGTTGATAAGTGTGTAAT    547         83
AAATGTAAATGAAGAGTGTTTGTT    548         -
GATAGAAGAAATGTATATAGTGAT    549         -
TATAGAGTGTATGTTATGATAAAG    550         -
TATGAAGTGATAAGATGAAGAATT    551         -
TGTTGAGAATAGTAAGAGAATTTA    552         -
TAGATAATGTGAAGTAATAAGTGA    553         84
GTATTATGATGATAGTAGTAAGTA    554         -
AGATATGATTTAGTATTGAATGTG    555         -
AATTAAGTTTGTAGAGTGATTTGA    556         -
AAGAAATAGATGTAGTAAGATGTT    557         -
TTGAGAAGTTGTTGTAATAAGAAT    558         -




A 


G 


T 


G 


T 


G 


A 


A 


A 


T 


A 


G 


T


G 


A 


A 


A 


G 


T 


T 


T 


A 


A 


A 


559 




T 


T 


T 


A 


T 


G 


T 


A 


G 


T 


A 


G 


A 


T 


T 


T 


A 


T 


G 


T 


G 


A 


A 


G 


560 




A 


T 


T 


A 


A 


T 


G 


A 


G 


A 


A 


A 


T 


T 


A 


G 


T 


G 


T 


G 


T 


T 


A 


G 


561 




A 


T 


G 


T 


T 


A 


A 


T 


A 


G 


T 


G 


A 


T 


A 


G 


T 


A 


A 


A 


G 


T 


G 


A 


562 




T 


A 


T 


G 


T 


T 


G 


A 


T 


A 


A 


A 


T 


G 


A 


T 


T 


A 


T 


G 


A 


G 


T 


G 


563 




T 


T 


A 


T 


T 


A 


G 


A 


G 


T 


T 


G 


T 


G 


T 


G 


T 


G 


A 


T 


A 


T 


A 


T 


564 




T 


G 


T 


T 


G 


T 


T 


A 


T 


G 


A 


T 


T 


G 


A 


G 


T 


T 


A 


G 


A 


A 


T 


A 


565 




A 


A 


T 


T 


T 


G 


A 


G 


T 


T 


A 


A 


G 


A 


A 


G 


A 


A 


G 


T 


G 


T 


A 


A 


566 




A 


A 


A 


G 


A 


T 


A 


A 


A 


G 


T 


T 


A 


A 


G 


T 


G 


T 


T 


T 


G 


T 


A 


G 


567 


88 


T 


G 


T 


T 


G 


A 


G 


A 


T 


G 


A 


T 


A 


T 


T 


G 


T 


A 


T 


A 


A 


G 


T 


T 


568 




T 


A 


A 


A 


T 


A 


G 


T 


G 


A 


A 


T 


G 


A 


G 


T 


T 


A 


T 


A 


G 


A 


G 


T 


569 




A 


T 


A 


G 


A 


T 


G 


T 


T 


A 


T 


G 


A 


T 


A 


G 


T 


T 


A 


G 


T 


T 


A 


G 


570 




G 


T 


T 


A 


A 


G 


T 


G 


A 


A 


G 


A 


T 


A 


T 


G 


T 


A 


T 


T 


G 


T 


T 


A 


571 




T 


A 


A 


G 


A 


A 


A 


G 


T 


A 


A 


A 


G 


T 


T 


T 


G 


T 


A 


G 


A 


T 


G 


T 


572 




A 


A 


G 


A 


G 


A


A 


A 


G 


T 


T 


T 


G 


A 


T 


T 


G 


A 


A 


T 


A 


A 


A 


G 


r T 1 

573 




A 


T 


A 


T 


T 


A 


G 


A 


T 


G 


T 


G 


A 


G 


T 


T 


A 


T 


A 


T 


G 


T 


G 


rp 

T 


574




A 


G 


T 


T 


T 


G 


A 


G 


T 


T 


T 


A 


G 


T 


A 


T 


T 


G 


T 


G 


A 


A 


T 


A


575 




A 


T 


G 


T 


T 


A 


A 


A 


T 


G 


A 


G 


A 


b 


A 


T 


T 


G 


rp 


b 


T 


A 


rp 
1 


A 

A 


576




T 


A 


A 


A 


T 


G 


T 


T 


G 


T 


G 


A 


T 


T 


A 


rp 

T 


T 


b 


rp 
1 


b 


A 

A 


b 


A 
A 


rp 
1 


577




T 


A 


A 


G 


A 


A 


T 


T 


G 


A 


A 


b 


rp 

T 


A 


A 


r~i 
Cj 


A 


b 


rp 


rp 
J. 


A 


rp 


rp 
i. 


b 


578




A 


G 


A 


G 


A 


T 


A 


G 


A


A 


rp 

T 


rp 


A 


A 


b 


rp 


rp 
1 


rp 
1 


b 


1 


I 


b 


A 
A 


rp 
1 


579




G 


A 


A 


G 


A 


A 


T 


G 


T 


T 


A 


A 


G 


A 

A 


A 

A 


A 

A 


rp 
1 


A 
A 


rp 
1 


b 


rp 


A 

A 


A 

A 


b 


580




T 


A 


T 


T 


T 


G 


T 


G 


A 


rn 

1 


T


A 


A 

A 


b 


A 

A 


A 


b 


rp 
1 


rp 
1 


b 


A 

A 


ft 
b 


A 

A 


A 

A 


581




A 


G 


T 


T 


A 


G 


A 


A 


T 


T 


T 


G 


T 


G 


T 


A 


G 


T 


A 


G 


A 


A 


T 


T 


582 




A 


A 


G 


T 


T 


T 


A 


T 


T 


G 


T 


T 


G 


A 


T 


G 


T 


T 


G 


T 


A 


T 


T 


G 


583 




G 


A 


A 


T 


G 


A 


G 


T 


T 


T 


A 


A 


G 


A 


G 


T 


T 


T 


A 


T 


A 


G 


T 


A 


584 




A 


G 


T 


G 


A 


A 


G 


A


T 


T 


G 


T 


A 


T 


G 


T 


A 


G 


T 


A 


T 


A 


A 


A 


585 




A 


G 


T 


T 


G 


A 


A 


A 


T 


G 


A 


G 


T 


A 


T 


T 


A 


A 


G 


T 


A 


A 


T 


G 


586 




A 


T 


G 


T 


G 


T 


T 


A 


T 


T 


T 


G 


A 


G 


A 


T 


G 


A 


G 


T 


A 


A 


T 


T 


587 




A 


A 


A 


T 


A 


G 


T 


G 


T 


T 


G 


T 


T 


G 


A 


A 


G 


T 


T 


G 


T 


T 


A 


T 


588 




G 


T 


A 


G 


A 


G 


A 


A 


A 


G 


A 


T 


A 


T 


A 


T 


G 


T 


A 


G 


T 


T 


T 


A 


589 




G 


A 


G 


A 


G 


T 


A 


T 


T 


T 


G 


A 


T 


G 


A 


A 


T 


G 


A 


T 


T 


A 


T 


A 


590 




G 


A 


G 


T 


A 


T 


A 


A 


G 


T 


T 


T 


A 


G 


T 


G 


T 


A 


T 


A 


T 


T 


G 


A 


591 




A 


T 


A 


A 


T 


G 


T 


G 


A 


T 


T 


A 


T 


T 


G 


A 


T 


T 


G 


A 


G 


A 


G 


A 


592

Table II

Sequence SEQ ID NO: No. in Ex 4



T 


T


A 


G 


T 


T 


G 


T 


T 


A 


T 


G 


T 


G 


A 


G 


A 


G 


T 


A 


A 


T 


A 


A 


593 




A 


A 


A 


T 


G 


A 


G 


T 


A 


T 


A 


T 


T 


G 


A 


A 


T 


T 


G 


T 


G 


A 


T 


G 


594




A 


A 


T 


T 


A 


G 


A 


A 


G 


T 


A 


A 


G 


T 


A 


G 


A 


G 


T 


T 


T 


A 


A 


G 


595 


3 


T 


G 


T 


A 


A 


G 


T 


T 


T 


A 


A 


A 


G 


T 


A 


A 


G 


A 


A 


A 


T 


G 


T 


G 


596


5 


G 


A 


A 


A 


T 


G 


A 


T 


A 


A 


G 


T 


T 


G 


A 


T 


A 


T 


A 


A 


G 


A 


A 


G 


597 




A 


A 


T 


G 


A 


G 


T


A 


G 


T 


T 


T 


G 


T 


A 


T 


T 


T 


G 


A 


G 


T 


T 


T 


598 




A 


G 


T 


G 


A 


A 


T 


G 


T 


A 


A 


G 


A 


T 


T 


A 


T 


G 


T 


A 


T 


T 


T


G 


599 


$ 


G 


T 


A 


A 


T 


T


G 


A 


A 


T 


T 


G 


A 


A 


A 


G 


A 


T 


A 


A 


G 


T 


G 


T 


600 


8 


T 

1 


A 
in. 


T 


G 


T 

X 


T 

X 


T 


A 


A 


G 


T 


A 


G 


T 


G 


A 


A 


A 


T


A 


G 


A 


G 


T 


601 




n, 


T 


A 


T 


T 


G 


A 


A 


A 


T 


T 


G 


A 


A 


T 


T 


A 


G 


A 


A 


G 


T 


A 


G 


602 




A 
in. 


A 
in. 


T 

X 


A 


T 

X 




T 


A 


A 


T 


G 


T 


A 


G 


T 


T 


G 


A 


A 


A 


G 


T 


G 


A 


603 




T 

X 


ri 
\j 


A 


A 


T 

X 


A 


T 


T 


G 


A 


G 


A 


A 


T 


T 


A 


T 


G 


A 


G 


A 


G 


T 


T 


604 






A 
in. 




T 


ri 
\j 


T 

X 


A 


A 


A 


T 


G 


A 


T 


G 


A 


A 


G 


A 


A 


A 


G 


T 


A 


X 


605






T 

X 


A 

in. 


T 

X 


r* 

V_J 


T 

. X 




T 


A 


A 


A 


G 


A 


A 


A 


T 


T


T 


G 


A 


T 


G 


T 


A 


606






A 


T 


T 


Cl 


T 


T 


T 


G 


A 


A 


A 


G 


T 


T 


T 


G 


T 


T 


G 


A 


G 


A 


A 


607 




A 
in. 


A 
in. 


T 
X 


T 

X 




T 

X 


T 

X 


T 

X 


r* 


A 


ri 
\j 


T 

X 


A 


G 


T 


A 


T 


T 


A 


G 


T 


A 
in. 




T 

X 


608




T 
x 


A 
in. 


A 

in. 


X 


T 

X 




A 
in. 


c 


T 

X 


T 


T 

X 




A 


A 


T 


A 


A 


G 


A 


G 


A 


G 


T 

X 


T 

X 


609




1 


ri 


X 


T 

X 


r* 

yj 


A 
in. 


X 


T 

X 


r* 


T 


A 
rn. 


A 


G 


T 


G 


T 


T 


T 


A 


T 


T 




T 


T 

x _ 


610




r< 


A 
r\ 


A 
-en 


A 

in. 


X 


T 
X 


T 

X 




X 


r* 
\j 


A 
in. 


\j 


T 

X 


A 


T 


G 
vj 


T 

X 


A 


T 


T 

X 


T 


ri 


A 
in. 


A 
rn. 


611




T 
1 


A 
r\ 


A 
in. 




A 
in. 


A 
in. 


X 




A 
in. 


A 
in. 


T 
X 


ri 
yj 


T 


G 


A 


A 




T 


ri 
<j 


A 


A 


T 

X 


A 
r\ 


T 

X 






X 


IS 
in. 


A 
in. 


T 
X 


p 


T 
i 


■ n. 

VJJ 


A 
in. 


A 
in. 


c* 

\J 


T 
X 


X 


T 

X 




T 

X 


r; 

Vj 


A 
in. 


A 
in. 


A 
in. 


n, 


A 

in. 


X 


A 
in. 


T 

X 


613




1 


T 
X 




X 


A 
in. 


X 


A 
in. 


X 


ri 

\J 


A 
rn. 


A 
in. 


A 
in. 




T 

X 


A 


A 
in. 


V_J 


A 
in. 


A 
in. 


ri 
\j 


A 
in. 


A 
in. 




X 


614




T 


in. 


a, 


A 
in. 


r» 
vj 


in. 




A 
in. 


A 
in. 


G 


A 
in. 


A 


G 


A 


A 


A 


T 


A 


A 


G 


A 


A 


T 


A 


615




A 
in. 


T 
X 


T 

X 


T 
X 


ri 


A 
in. 


A 
in. 


A 
in. 


T 
X 




X 


T 
X 


A 


A 


T 




A 


G 


A 


G 


A 


\J 


A 
rn. 


T 

X 


616




1 


T 

X 




X 


ri 
\j 


X 


n 


T 

X 


A 
in. 


T 

X 


A 
in. 


T 

X 


A 




T 

X 


A 
in. 


T 

X 


X 


A 
in. 


\j 


A 
in. 


A 
in. 


T 
X 


a 
\j 


617




A 
in. 


T 

X 


T 

X 


ri 


T 
X 


T 
X 


A 
in. 


n 
\j 


T 
X 


A 
rn. 


X 


T 

X 


G 


A 


T 


r; 


T 

X 




A 


A 


G 


T 
X 


T 

X 


A 
rn. 


618




T 




T 

X 


T 

X 


X 


r* 


T 
X 


A 

in. 


X 


T 


T 

X 




A 


A 


T 


G 


A 


A 


A 


T 


G 


A 
in. 


A 
in. 


G 


619




T 


G 


T 


T 


A 




A 


T 


T 


G 


T 


G 


T 


T 


A 


A 


A 


T 


G 


T 


A 


G 


T 


T 


620




T 

X 


A 
in. 


T 

X 


A 
in. 


ri 
\j 


A 
in. 


VJ 


T 

X 


A 
in. 


T 


T 

X 




T 


A 


T 


A 


G 


A 


G 


A 


G 


A 
in. 


A 
rn. 


A 
rn. 


621




A 


A 


A 


T 


A 
in. 




T 


A 


A 


G 


A 


A 


T 


G 


T 


A 


G 


T 


T 


G 


T 


T 


G 


A 


622




T 


G 


A 


G 


T 


G 


T 


G 


A 


T 


T 


T 


A 


T 


G 


A 


T 


T 


A 


A 


G 


T 


T 


A 


623






G 


A 


A 


T 
X 


T 

X 


T 


G 


T 


T 


G 


T 


A 


G 


x 


G 


T 


T 


A 


T 


G 


A 


T 


T 


624 




G 


A 


T 


T 




A 


A 


G 


A 


A 


A 


G 


A 


A 


A 


T 


A 


G 


T 


T 


T 


G 


A 


A 


625




Q 


A 


T 


A 


A 


T 


A 


G 


A 


G 


A 


A 


T 


A 


G 


X 


A 


G 


A 


G 


T 


T 


A 


A 


626




G 


A 


T 


T 


G 


A 


A 


A 


T 


T 


T 


G 


T 


A 


G 


T 


T 


A 


T 


A 


G 


T 


G 


A 


627 




G 


A 


T 


T 


T 


A 


A 


G 


A 


A 


G 


A 


T 


G 


A 


A 


T 


A 


A 


T 


G 


T 


A 


G 


628 




T 


T 


T 


G 


A 


G 


A 


G 


A 


A 


A 


G 


T 


A 


G


A 


A 


T 


A 


A 


G 


A 


T 


A 


629 




G 


A 


T 


T 


A 


A 


G 


A 


G 


T 


A 


A 


A 


T 


G 


A 


G 


T 


A 


T


A 


A 


G 


A 


630 




T 


T 


T 


G 


A 


T 


A 


G 


A 


A 


T 


T 


G 


A 


A 


A 


T 


T 


T 


G 


A 


G 


A 


G 


631 




T 


G 


A 


A 


G 


A 


A 


G 


A 


G 


T 


G 


T 


T 


A 


T 


A 


A 


G 


A 


T 


T 


T 


A 


632 




G 


T 


G 


A 


A 


A 


T 


G 


A 


T 


T 


T 


A 


G 


A 


G 


T 


A 


A


T 


A 


A 


G 


T 


633 




A 


A 


A 


T 


A 


A


G 


A 


A 


T 


A 


G 


A 


G 


A 


G 


A 


G 


A 


A 


A 


G 


T 


T 


634 


9 


G 


T 


T 


G 


T 


A 


A 


A 


G 


T 


A 


A 


T 


A 


G 


A 


G 


A 


A 


A 


T 


T 


A 


G 


635 




A 


G 


T 


G 


A 
in. 


T 

X 


T 


T 


A 


G 


A 


T 


T 


A 


T 


G 


T 


G 


A 


T 


G 


A 


T 


T 


636 




A 


G 


A 


G 


T 


A 


T 


A 


G 


T 


T 


T 


A 


G 


A 


T 


T 


T 


A 


T 


G 


T 


A 


G 


637




A 


T 


G 


A 


T 


T 


A 


G 


A 


T 


A 


G 


T 


G 


A 


A 


A 


T 


T 


G 


T 


T 


A 


G 


638




A 


T 


G 


A 


A 


A 


T 


G 


T 


A 


T 


T 


A 


G 


T 


T 


T 


A 


G 


A 


G 


T 


T 


G


639 




A 


T 


A 


T 


T 


G 


A 


G 


T 


G 


A 


G 


A 


G 


T 


T 


A 


T 


T 


G 


T 


T 


A 


A 


640 




A


G 


A 


T 


G 


T 


G 


T 


A 


T 


T 


G 


A 


A 


T 


T 


A 


A 


G 


A 


A 


G 


T 


T 


641 




T 


A 


A 


T 


G 


T 


G 


T 


T 


G 


A 


T 


A 


G 


A 


A 


T 


A 


G 


A 


G 


A 


T 


A 


642 




A


A 


A 


T 


T 


A 


G 


T 


T 


G 


A 


A 


A 


G 


T 


A 


T 


G 


A 


G 


A 


A 


A 


G 


643 


11 


T 


T 


T 


A 


G 


A 


G 


T 


T 


G 


A 


A 


G 


A 


A 


A 


T 


G 


T 


T 


A 


A 


T 


G 


644 




G 


A 


T 


T 


G 


T 


T 


G 


A 


T 


T 


A 


T 


T 


G 


A 


T 


G 


A 


A 


T 


T 


T 


G 


645 




T 


G 


T 


T 


G 


T 


T 


G 


T 


T 


G 


A 


A 


T 


T 


G 


A 


A 


G 


A 


A 


T 


T 


A 


646 








A 


T 


T 


A 


A 


G 


T 


A 


A 


G 


A 


A 


T 


T 


G 


A 


G 


A 


G 


T 


T 


T 


G 


A 


647 


12 


G 


T 


A 


T 


G 


T 


T 


G 


T 


A 


A 


T 


G 


T 


A 


T 


T 


A 


A 


G 


A 


A 


A 


G 


648


15 


T 


A 


G 


T 


T 


G


T 


G 


A 


T 


T 


T 


A 


T 


G 


T 


A 


A 


T 


G 


A 


T 


T 


G 


649




T 


G 


A 


T 


A 


A 


T 


G 


A 


A 


A 


G 


T 


T 


T 


A 


T 


A 


G 


A 


G 


A 


G 


A 


650




G 


T 


A 


A 


G 


A 


T 


T 


G 


T 


T 


T 


G 


T 


A 


T 


G 


A 


T 


A 


A 


G 


A 


T 


651




T 


T 


g 

lj 


A 


A 


T 


T 


A 


A 


G 


A 


G 


T 


A 


A 


G 


A 


T 


G 


T 


T 


T 


A 


p 
\j 


652




A 


A 


p 

LJ 


T 

X 


P 
lj 


T 

X 


T 

X 


T 




T 


T 


T 


A 


G 


A 


G 


T 


A 


A 


A 


G 

LJ 


A 
ri. 


T 

X 


A 
rA 


653




A 


G 


A 


G 


A 


p 

LJ 


A 


T 


A 


A 


A 


G 


T 


A 


T 


A 


G 


A 


A 


G 


T 


T 


A 


A 


654




A 


T 


T 


A 
Pi. 


T 

X 


r« 

LJ 


A 
Pi 


A 

p\ 


T 

X 


A 


p 


T 

X 


T 

X 


A 

p\. 


p 


A 
ri. 


A 
r\ 


A 
ri. 


p 

LJ 


A 
ri. 


G 

LJ 


A 
ri. 


p 

Lj 


T 

X 


655




T 


T 


r» 

Li 


T 
1 


X 


Lj 


A 
Pi. 


T 

X 


A 
r\ 


T 
X 


T 


A 


G 


A 


G 


A 


A 


T 
X 




T 

X 


P 
Lj 


T 
X 


T 
X 


X 


656




T 


T 


T 
X 


A 
ri. 


X 


X 


n 

LJ 


A 
P\ 


P 
\J 


A 


p 

VJ 


T 

X 


T 

X 


T 

X 


p 


T 

X 


T 
X 


A 
ri. 


T 

X 


X 


T 

X 


p 
Lj 


T 

X 


p 
Lj 


657




A G 


X 


Lj 


T 

X 


T 

X 


A 

Pl 


A 
r\ 


P 


A 


A 


p 


T 

X 


T 

X 


p 


A 
ri. 


T 

X 


T 
X 


A 
ri. 


T 

X 


T 

X 


P 
Lj 


A 
ri. 


T 

X 


658




r» 
Li 


A 

ri. 


p 
lj 


A 
ri. 


A 
ri. 


A 
ri. 


T 
1 


P 


A 
ir\ 


T 
X 


T 


G 


A 


A 


T 


G 


T 


T 


G 


A 
ri. 


T 

X 


A 
ri. 


A 
ri. 


X 


659




p 

Li 


A 

ri. ■ 


T 

X 


A 
ri. 


ri. 


p 
Lj 


T 
X 


A 


T 
X 


T 
1 


A 
ri. 


. p 


T 

X 


A 
Pi. 


T 


p 


A 
ri. 


p 


X 


p 

L3 


T 

X 


A 
ri. 


A 
Pi. 


T 
X 


660




T 
1 


1 


T 

X 


lj 


A 

ri. 


X 


T 


X 


A 


A 


P 


A 
rT. 


p 


T 
x 


P 


T 

X 


T 
X 


p 


A 
ri 


A 
ri. 


T 

X 


a, 

LJ 


T 

J- 


A 
ri. 


661


_L D 


A 
ri. 


A 
ri. 


lj 


T 


X 


A 
ri. 


p 


X 


A 
Pk 


A 
Pi. 


A 


T 

X 


A 
Pi. 


P 
\j 


A 
ri 


p 


X 


A 
ri. 


p 
Lj 


A 
ri. 


A 
ri. 


A 
ri. 


P 
Lj 


A 
ri. 


662




n, 


T 

X 


A 

ri. 


A 
ii. 


A 
ri. 


Lj 


T 
1 


A 


X 


P 


A 
M. 


A 
Pi. 


T 

X 


A 


T 
X 


p 

w 


X 


P 
Li 


ri. 


A 
ri. 


A 
ri. 


T 
X 


p 
Lj 


X 


663




T 
X 


ft. 

ri. 


A 
ri. 


X 


A 
r\ 


A 
Pi. 


a 


X 


p 


T 
X 


P 


X 


X 


P 
Li 


T 
X 


p 

Lj 


A 
ri. 


A 
ri. 


T 
X 


p 
Lj 


X 


A 
Pi. 


a 
Pi. 


rrt 
X 


664




A 
ri. 


A 
ri. 


A 
ri. 


r> 
lj 


A 
ri. 


X 


T 
X 


X 


A 


p 
Lj 


A 


P 
Li 


T 
X 


A 
Pi. 


p 


a 

ri. 


A 
Pi. 


A 
ri. 


p 

L7 


a 
P\ 


p 
Lj 


A 
ri. 


A 
ri. 


X 


665




T 
1 


T 


A 

.H. 


r» 
L 


X 


T 
1 


X 


p 
Lj 


A 


p 


X 


T 
X 


p 
Li 


A 


A 
Pi. 


A 
ri. 


T 

X 


A 
ri. 


p 

L7 


X • 


A 
Pi 


A 


A 

Pi. 


p 
Lj 


666




T 
1 


A 
ri. 


A 


1 


A 
ri. 


Li 


rp 
X 


A 


X 


p 
Lj 


A 
Pi. 


p 
Li 


X 


A 
Pi. 


A 
r\ 


Lj 


A 
ri. 


X 


T 
1 


p 
Li 


A 

ri. 


A 
ri. 


A 
Pi. 


P 
Lj 


667




fl 
\j 


A 
ri. 


A 
ri. 


Lj 


A 
Pi. 


X 


T 
X 


A 


p 


A 


X 


T 
X 


p 
Li 


A 


T 
1 


p 
Lj 


X 


X 


ri 


p 
Li 


T 
X 


X 


A 
Pi. 


A 
Pi. 


668




T 


A 
ri. 


A 
ri. 


A 
ri. 


n 

Lj 


A 
ri. 


p 
Lj 


A 
A 


p 


A 
Pi. 


A 
Pi. 


p 
Lj 


T 
X 


T 
X 


A 
Pi. 


p 


T 
X 


A 
ri. 


A 
ri. 


T 
X 


A 
Pi 


Lj 


A 
Pi. 


A 
ri. 


Q 

669




T 


A 

ri. 


A 


p 
Lj 


X 


A 


T 
1 


p 


A 


p 


A 


A 
r\ 


A 
Pi. 


1 


P 


A 
ri. 


X 


P 
Li 


T 

1 


p 

Lj 


T 
X 


T 
X 


A 
ri. 


X 


670




r* 


A 

ri. 


p 
Lj 


T 
1 


T 
1 


1 


p 
Lj 


T 
X 


X 


X 


p 


X 


T 


A 


p 
Lr 


X 


X 


A 
Pi. 


X 


X 


p 
Lj 


A 
Pi. 


X 


A 
ri. 


671




A 


A 


p 

Li 


1 


A 


A 
Pi. 


A 
r\ 


p 


A 
Pi. 


A 


A 
Pi 


T 
X 


p 
Li 


X 


T 

X 


A 
ri. 


A 
r\ 


p 
Lr 


A 
Pi. 


p 
Lj 


T 
X 


A 

ri. 


p 
Lj 


X 


672






T 


lj 


A 
ri. 


p 
lj 


A 
/i. 


A 
P. 


X 


X 


p 


T 

X 


X 


p 


T 
1 


T 


P 
Li 


A 
ri. 


A 
ri. 


A 
ri. 


T 
X 


p 
Li 


T 
X 


A 
ri. 


A 
Pi. 


673




T 


T 


A 

ri. 


p 
Lj 


A 

ri. 


X 


T 
X 


A 
i-v 


p 


A 
Pi. 


p 


X 


A 


p 


T 


A 
ri. 


P 


A 
Pi. 


A 

ri. 


p 
Lj 


A 
ri. 


A 

ri. 


X 


A 
ri. 


674




T 


A 


r* 

Lj 


T 


Lj 


A 
ri. 


X 


P 

yj 


A 


A 


CI 


A 


A 


p 




T 
X 


A 
ri. 


p 
Lj 


A 
ri. 


A 
rA 


A 
ri. 


T 

X 


T 
X 


A 
ri. 


675




T 


A 


A 

ri. 


T 

X 


r* 

LJ 


X 


A 
Pi. 


p 


X 


A 
Pi. 


A 
i-v 


X 


P 


T 

X 


p 


A 

ri. 


X 


p 


A 
ri. 


T 
X 


A 
ri. 


A 
ri. 


p 
Lj 


T 
X 


676


• 


T 


T 


LJ 


A 
ri. 


LJ 


A 

ri. 


A 
ri. 


A 
r\ 


P 


A 


A 
Pi. 


T 

X 


A 


A 




T 


A 
ri. 


P 


T 
1 


p 

Lj 


T* 

X 


A 
ri. 


A 
ri. 


A 
ri. 


677




T 


A 


A 
ri. 


T 


p 

VJ 


A 
ri. 


P 

lj 


T 
X 


P 

VJ 


A 


P 




T 


T 


A 


T 


A 
ri. 


P 


A 
ri. 


T 

X 


X 


P 
LJ 


T 
X 


T 
X 


678




G 


T 




T 


A 
ri. 


A 
Jri. 


p 
lj 


A 
r\ 


A 
Pi. 


A 
Pi. 


T 




T 




T 




X 


T 

X 


T 
X 


p 

Lj 


A 
ri. 


X 


X 


A 

ri. 






G 


T 


p 

LJ 


A 
ri. 


A 
Pi 


T 
X 


P 


T 
X 


P 


X 


T 

X 


A 
P\. 


A 
rA 


T 

X 


p 


A 


A 


p 


A 
ri. 


X 


A 
Pi. 


T 

X 


A 
ri. 


T 
X 


680




G 


A 


A 


A 


G 


T 


T 


A 


T 


T 


A 


G 


T 


A 


G 


T 


T 


A 


A 


A 


G 


A 


T 


ri 
\j 


681




T 


A 


G 


A 


A 


T 


T 


G 


T 


G 


T 


T 


T 


G 


A 


T 


A 


A 


G 


T 


G 


A 


T 


A 


682




T 


G 


^ 


T 


T 


T 


A 


G 


A 


T 


T 


G 


A 


G 


A 


G 


T 


T 


A 


A 


A 


T 


G 


A 


683




A 


T 


T 


A 


T 


T 


G 


A 


G 


T 


T 


T 


G 


A 


A 


T 


G 


T 


T 


G 


A 


T 


A 


G 


684 




A 


T 


A 


G 


T 


A 


G 


T 


T 


A 


T 


G 


T 


T 


T 


G 


A 


T 


T 


T 


A 


G 


T 


G 


685




A 


T 


A 


G 


A 


A 


G 


A 


A 


G 


A 


A 


T 


A 


A 


A 


G 


T 


T 


A 


G 


A 


G 


A 


686 




G 


A 


T 


G 


T 


T 


G 


A 


A 


A 


G 


T 


A 


A 


T 


G 


A 


A 


T 


T 


T 


G 


T 


A 


687 




G 


A 


G 


A 


T 


T 


G 


A 


T 


A 


G 


T 


A 


G 


A 


A 


A 


T 


G 


A 


T 


A 


A 


A 


688 




T 


G 


A 


G 


A 


G 


A 


A 


T 


A 


A 


A 


G 


T 


A 


T 


G 


A 


A 


T 


T 


T 


G 


A 


689




T 


A 


T 

X 


A 
ri. 


A 
ri 


A 
ri. 


p 


A 

p\ 


X 


p 


A 

p\ 


T 

X 


p 


T 

X 


p 


A 
ri. 


A 

ri. 


T 
X 


T 

, X 


A 
ri. 


p 

Lj 


X 


A 
ri. 


p 

Lj 


690




T 


T 


A 


T 


G 


T 


A 


A 


G 


A 


A 


T 


G 


T 


T 


T 


G 


A 


G 


A 


G 


A 


A 


A 


691 




A 


G 


T 


A 


A 


A 


T 


G 


A 


T 


G 


A 


A 


T 


G 


A 


T 


A 


T 


G 


A 


T 


G 


A 


692 




G 


A 


A 


A 


T 


T 


T 


G 


T 


G 


T 


T 


A 


A 


A 


G


T 


T 


G 


A 


A 


T 


G 


A 


693 




G 


A 


T 


G


A 


A 


T 


G 


A 


T 


T 


G 


T 


G 


T 


T


T 


A 


A 


G 


T 


A 


T 


A 


694 




G 


A 


A 


A 


T 


A 


A 


G 


T 


G 


A 


G 


A 


G 


T 


T 


A 


A 


T 


G 


A 


A 


A 


T 


695 




T 


G 


T 


T 


G 


A 


A 


A 


T 


A 


G 


T


T 


A 


T 


T 


A 


G 


T 


T 


T 


G 


T 


G 


696 




T 


T 


T 


G 


A 


G 


A 


G 


T 


A 


T 


A 


T 


T 


G 


A 


T 


A 


T 


G 


A 


G 


A 


A 


697




A 


T 


T 


G 


T 


G 


T 


G 


T 


A 


A 


A 


G 


T 


A 


A 


G 


A 


T 


T 


T 


A 


A 


G 


698




T 


A 


T 


A 


G 


T 


T 


T 


G 


A 


A 


G 


T 


G 


T 


G 


A 


T 


G 


T 


A 


T 


T 


T 


699 




G 


T 


G 


A 


A 


G 


T 


T 


A 


T 


A 


G 


T 


G 


T 


A 


T 


A 


A 


A 


G 


A 


A 


T 


700 








G 


T 


A 


T 


G 


T 


T 


G 


A 


A 


T 


A 


G 


T 


A 


A 


A 


T 


A 


G 


A 


T 


T 


G 


701 




T 


T 


A 


G 


A 


A 


A 


G 


T 


G 


T 


G 


A 


T 


T 


T 


G 


T 


G 


T 


A 


T 


T 


T 


702 


_ 


T 


T 


T 


A 


G 


T 


A 


A 


T 


A 


T 


G 


T 


A 


A 


G 


A 


G 


A 


T 


G 


T 


G 


A 


703 


_ 


A 


G 


T 


A 


T 


G 


T 


A 


T 


A 


G 


A 


T 


G 


A 


T 


G 


T 


T 


T 


G 


T 


T 


T 


704 


_ 


A 


T 


T 


T 


A 


A 


G 


T 


A 


A 


A 


G 


T 


G 


T 


A 


G 


A 


G 


A 


T 


A 


A 


G 


705 


20 


A 


T 


T 


T 


G 


T 


G 


T 


T 


G 


A 


A 


T 


T 


G 


T 


A 


A 


A 


G 


T 


G 


A 


A 


706 




A 


T 


G 


T 


T 


A 


T 


T 


A 


G 


A 


T 


T 


G 


T 


G 


A 


T 


G 


A 


A 


T 


G 


A 


707




T 


A 


G 


T 


A 


G 


T 


A 


G 


A 


A 


T 


A 


T 


G 


A 


A 


A 


T 


T 


A 


G 


A 


G 


708 




T 


T 


T 


A 


A 


T 


G 


A 


G 


A 


A 


G 


A 


G 


T 


T 


A 


G 


A 


G 


T 


A 


T 


A 


709 




A 


A 


A 


G 


T 


T 


T 


A 


G 


T 


A 


G 


A 


G 


T 


G 


T 


A 


T 


G 


T 


A 


A 


A 


710 




A 


T 


A 


T 


A 


T 


G 


A 


T 


A 


G 


T 


A 


G 


A 


G 


T 


A 


G 


A 


T 


T 


A


G 


711 




T 


G 


A 


G 


A 


A 


G 


T 


T 


A 


A 


T 


T 


G 


T 


A 


T 


A 


G 


A 


T 


T 


G 


A 


712 


_ 


T 


A 


T 


A 


G 


A 


G 


A 


T 


G 


T 


T 


A 


T 


A


T 


G 


A 


A 


G 


T 


T 


G 


T 


713 




A 


A 


A 


T 


T 


T 


G 


T 


T 


A 


A 


G 


T 


T 


G 


T 


T 


G 


T 


T 


G 


T 


T 


G 


714 




T 


T 


G 


T 


T 


G


A 


A 


G 


A 


T 


G 


A 


A 


A 


G 


T 


A 


G 


A 


A 


T 


T 


A 


715 




A 


A 


G 


A 


G 


A 


T 


A 


A 


G 


T 


A 


G 


T 


G 


T 


T 


T 


A 


T 


G 


T 


T 


T 


716 




A 


A 


T 


A 


A


G 


A 


A 


G 


A 


A 


G 


T 


G 


A 


A 


A 


G 


A 


T 


T 


G 


A 


T 


717 




T 


A 


A 


G 


T 


T 


A 


A 


A 


G 


T 


T 


G 


A 


T 


G 


A 


T 


T 


G 


A 


T 


A 


G 


718 




A 


T 


A 


T 


A 


A 


G 


A 


T 


A 


A 


G 


A 


G 


T 


G 


T 


A 


A 


G 


T 


G 


A 


T 


719




G 


T 


T 


A 


A 


A 


T 


G 


T 


T 


G 


T 


T 


G 


T 


T 


T 


A 


A 


G 


T 


G 


A 


T 


720 




G 


A 


G 


T 


T 


A 


A 


G 


T 


T 


A 


T 


T 


A 


G 


T 


T 


A 


A 


G 


A 


A 


G 


T 


721 




T 


A 


T 


T 


A 


G 


A 


G 


T 


T 


T 


G 


A 


G 


A 


A 


T 


A 


A 


G 


T 


A 


G 


T 


722 


21 


T 


A 


A 


T 


G 


T 


T 


G 


T 


T 


A 


T 


G 


T 


G 


T 


T 


A 


G 


A 


T 


G 


T 


T 


723 




G 


A 


A 


A 


G 


T 


T 


G 


A 


T 


A 


G 


A 


A 


T 


G 


T 


A 


A 


T 


G 


T 


T 


T 


724 




T 


G 


A 


T 


A 


G 


A 


T 


G 


A 


A 


T 


T 


G 


A


T 


T 


G 


A 


T 


T 


A 


G 


T 


725 




A 


T 


G 


A 


T


A 


G 


A 


G 


T 


A 


A 


A 


G 


A 


A 


T 


A 


A 


G 


T 


T 


G 


T 


726 




A 


G 


T 


A 


A 


G 


T 


G 


T 


T 


A 


G 


A 


T 


A 


G 


T 


A 


T 


T 


G 


A 


A 


T 


727 


27 


A 


T 


G 


T 


A 


G 


A 


T 


T 


A 


A 


A 


G 


T


A 


G 


T 


G 


T 


A 


T 


G 


T 


T 


728 




T 


T 


A 


T 


T 


G 


A 


T 


A 


A 


T 


G 


A 


G 


A 


G 


A 


G 


T 


T 


A 


A 


A 


G 


729




A 


T 


T 


T 


G 


T 


T 


A 


T 


G 


A 


T 


A 


A 


A 


T 


G 


T 


G 


T 


A 


G 


T 


G 


730 


29 


T 


T 


G 


A 


A 


G 


A 


A 


A 


T 


A 


A 


G 


A 


G 


T 


A 


A 


T 


A 


A 


G 


A 


G 


731




T 


G 


T 


G 


T 


A 


A 


T 


A 


A 


G 


T 


A 


G 


T 


A 


A 


G 


A 


T 


T 


A 


G 


A 


732 




A 


T 


G 


A 


A 


A 


G 


T 


T 


A 


G 


A 


G 


T 


T 


T 


A 


T 


G 


A 


T 


A 


A 


G 


733 


37 


A 


T 


T 


A 


G 


T 


T 


A 


A 


G 


A 


G 


A 


G 


T 


T 


T 


G 


T 


A 


G 


A 


T 


T 


734 




T 


G 


T 


A 


G 


T 


A 


T 


T 


G 


T 


A 


T 


G 


A 


T 


T 


A 


A 


A 


G 


T 


G 


T 


735 




A 


G 


T 


T 


G 


A 


T 


A 


A 


A 


G 


A 


A 


G 


A 


A 


G 


A 


G 


T 


A 


T 


A 


T 


736




G 


T 


A 


A 


T 


G 


A 


G 


A 


T 


A 


A 


A 


G 


A 


G 


A 


G 


A 


T 


A 


A 


T 


T 


737 


_ 


T 


G 


T 


G 


T


T 


G 


A 


A 


G 


A 


T 


A 


A 


A 


G 


T 


T 


T 


A 


T 


G 


A 


T 


738 


_ 


A 


A 


G 


A 


A 


G 


A 


G 


T 


A 


G 


T 


T 


A 


G 


A 


A 


T 


T 


G 


A 


T 


T 


A 


739 




G 


A 


A 


T 


G 


A 


A 


G 


A 


T 


G 


A 


A 


G 


T 


T 


T 


G 


T 


T 


A 


A 


T 


A 


740 




A 


A 


A 


T 


T 


G 


T 


T 


G 


A 


G 


A 


T 


A 


A 


G 


A


T


A 


G 


T 


G 


A 


T 


741 


_ 


T 


G 


A 


T 


T 


G


T 


T 


T 


A 


A 


T 


G 


A 


T 


G 


T 


G 


T 


G 


A 


T 


T 


A 


742 


_ 


A 


T 


G 


A 


A 


G 


T 


A 


T 


T 


G 


T 


T 


G 


A 


G 


T 


G 


A 


T 


T 


T 


A 


A 


743 


_ 


G 


T 


G 


T 


A 


A 


A 


T 


G 


T 


T 


T 


G 


A 


G 


A 


T 


G 


T 


A 


T 


A 


T 


T 


744 


46 


A 


A 


T 


T 


G 


A 


T 


G 


A 


G 


T 


T 


T 


A 


A 


A 


G 


A


G 


T 


T 


G 


A 


T 


745 




T 


T 


T 


G 


T 


G 


T 


A 


A 


T 


A 


T 


G 


A 


T 


T 


G 


A 


G 


A 


G 


T 


T 


T 


746 




G 


T 


A 


G 


T 


A 


G 


A 


T 


G 


A 


T 


T 


A 


A 


G 


A 


A 


G 


A 


T 


A 


A 


A


747 




T 


T 


T 


A 


A 


T 


G 


T 


G 


A 


A 


A 


T 


T 


T 


G 


T 


T 


G 


T 


G 


A 


G 


T 


748




G 


T 


A 


A 


A 


G 


A 


A 


T 


T 


A 


G 


A 


T 


A 


A 


A 


G 


A 


G 


T 


G 


A 


T 


749 




A 


A 


T 


A 


G 


T 


T 


A 


A 


G 


T 


T 


T 


A 


A 


G 


A 


G 


T 


T 


G 


T 


G 


T 


750 




G 


T 


G 


T 


G 


A 


T 


G 


T 


T 


T 


A 


T 


A 


G 


A 


T 


T 


T 


G 


T 


T 


A 


T 


751 




G 


T 


A 


T 


A 


G 


T 


G 


T 


G 


A 


T 


T 


A 


G 


A 


T 


T 


T 


G 


T 


A 


A 


A 


752 


49 


G 


T 


T


G 


T 


A 


A 


G 


A 


A 


A 


G 


A 


T 


A 


T 


G 


T 


A 


A 


G 


A 


A 


A 


753 




A 


T 


A 


T 


T 


A 


G 


A 


T 


T 


G 


T 


A 


A 


A 


G 


A 


G 


A 


G 


T 


G 


A 


A 


754 








G 


A 


G 


T 


G 


A 


T 


A 


T 


T 


G 


A 


A 


A 


T 


T 


A 


G 


A 


T 


T 


G 


T 


A 


755 




T 


A 


A 


G 


A 


A 


G 


T 


T 


A 


A 


A 


G 


A 


A 


G 


A 


G 


A 


G 


T 


T 


T 


A 


756 




G 


A 


T 


G 


T 


T 


A 


G 


A 


T 


A 


A 


A 


G 


T 


T 


T 


A 


A 


G 


T 


A 


G 


T 


757 




G 


T 


G 


A 


T 


T 


G 


T 


A 


T 


G 


A 


G 


A 


A 


A 


T 


G 


T 


T 


A 


A 


A 


T 


758 




T 


G 


A 


T 


T 


A 


T 


T 


G 


T 


A 


A 


G 


A 


A 


A 


G 


A 


T 


T 


G 


A 


G 


A 


759 




A 


A 


G 


A 


A 


T 


T 


G 


T


G 


T 


A 


A 


G 


T 


T 


T 


A 


T 


G 


A 


G 


T 


A 


760 




T 


T 


G 


T 


A 


T 


T 


T 


A 


G 


A 


A 


G 


A 


T 


T 


T 


G 


T 


A 


G 


A 


T 


G 


761




T 


A 


T 


A 


T 


G 


T 


T 


T 


G 


T 


G 


T 


A 


A 


G 


A 


A 


G 


A 


A 


A 


T 


G 


762 




G 


A 


T 


A 


A 


T 


G 


T 


G 


T 


G 


A 


A 


T 


T 


T 


G 


T 


G 


A 


A 


T 


A 


A 


763 




T 


T 


A 


G 


A 


A 


A 


T 


G 


T 


G 


A


G 


A 


T 


T 


T 


A 


A 


G 


A 


G 


T 


T 


764




A 


G 


T 


G 


T 


A 


G 


A 


A 


T 


T 


T 


G 


T 


A 


T 


T 


T 


A


G 


T 


T 


G 


T 


765 




T 


A 


G 


T 


T 


A 


A 


G 


A 


T 


A 


G 


A 


G 


T 


A 


A 


A 


T 


G 


A 


T 


A 


vj 


766




G 


A 


A 


G 


T 


G 


A 


T 


A 


T 


T 


G 


T 


A 


A 


A 


T 


T 


G 


A 


T 


A 


A 


G 


767 




G 


T 


A 


A 


T 


T 


G 


T 


G 


T 


T 


A 


G 


A 


T 


T 


T 


A 


A 


G 


A 


A 


G 


T 


768




T 


G 


A 


T 


A 


T 


T 


T 


G 


T 


G 


A 


A 


T 


T 


G 


A 


T 


A 


G 


T 


A 


T 


G 


769




A 


A 


VJ 


T 


A 


A 


A 


G 


A 


G 


A 


T 


A 


T 


A 


G 


T 


T 


A 


A 


G 


T 


T 


c 

VJ 






A 


T 

X 


T 
X
A


T 


A 


A 


G 


A 


T 


T 


T 


G 


A 


T 


G 


T 


A 


G 


T 


G 


T 


A 


G 


T 


853 




T 


A 


T 


G 


T 


T 


T 


G 


T 


T 


G 


T 


T 


G 


T 


T 


A 


A 


G 


T 


T 


T 


G 


A 


854 




G 


A 


T 


A 


G 


T 


T 


T 


A 


G 


T 


A 


T 


A 


G 


A 


A 


G 


A 


T 


A 


A 


A 


G 


855 




G 


T 


T 


G 


A 


A 


T 


A 


T 


A 


G 


A 


G 


A 


T 


A 


G 


T 


A 


A 


A 


T 


A 


G 


856 




A 


G 


A 


G 


A 


A 


G 


A 


T 


T 


T 


A 


G 


T 


A


A 


G 


A 


A 


T 


G 


A 


T 


A 


857 




T 


G 


A 


A 


T 


G 


A 


G 


A 


A 


A 


G 


A 


T 


A 


T 


T 


G 


A 


G 


T 


A 


T 


T 


858 




T 


G 


A 


A 


G 


A 


T 


T 


A 


T 


A 


G 


T 


A 


G 


T 


T 


G 


T 


A 


T 


A 


G 


A 


859 




G 


A 


T 


T 


A 


G 


T 


A 


G 


T 


A 


T 


T 


G 


A 


A 


G 


A 


T 


T 


A 


T 


G 


T 


860 




T 


G 


A 


A 


A 


T 


G 


T 


G 


T 


A 


T 


T 


T 


G 


T 


A 


T 


G 


T 


T 


T 


A 


G 


861 


59 


A 


T 


T 


A 


A 


A 


G 


T 


T 


G 


A 


T 


A 


T 


G 


A 


A 


A 


G 


A 


A 


G 


T 


G 


862 








Table II 

Sequence SEQ ID NO: No. in 

Ex 4 



A 


A 


T


G 


T 


A 


G 


A 


G 


A 


T 


T 


G 


T 


A 


G 


T 


G 


A 


A 


T 


A 


T 


T 


863 


62 


T 


T 


A 


T 


T 


T 


G 


T 


T 


G 


A 


G 


T 


G 


T 


A 


A 


A 


T 


G


T 


G 


A 


T 


864 




A 


T 


G 


T 


A 


A 


T 


T 


G 


T 


G 


A 


A 


T 


A 


A 


T


G 


T 


A 


T 


G 


T 


G 


865 


63 


G 


A 


T


T 


T 


G 


T 


A 


T 


A 


G 


A 


G 


A 


T 


T 


A 


G 


T 


A 


A 


G 


T 


A 


866 




A 


A 


T 


A 


T 


T 


G 


T 


T 


G 


T 


T 


T 


A 


G 


A 


G 


A 


A 


A 


G 


A 


A 


G 


867 




A 


T 


G 


A 


T 


G 


A 


T 


G 


T 


A 


T 


T 


T 


G 


T 


A 


A 


A 


G 


A 


G 

VJ 


T 


A 


868




A 


A 


T 


G 


T 


A 


T 


T 


T 


G 


T 


G 


T 


G 


A 


T 


T 


G 


T 


G 


T 


A 


A 


A 


869




A 


G 


T 


G 


T 


T 


A 


T 


G 


A 


A 


G 


A 


A 


T 


A 


G 


T 


A 


A 


G 


A 


A 


T 


870




Q 


T 


T 


A 


T 


G 


T 


A 


G 


A 


G 


A 


T 


G 


A 


A 


A 


G 


A 


A 


A 


T 


T 


A


871


D -J 


Q 


T 


T 


T 


G 


T 


A 


T 


T 


A 


G 


A 


T 


A 


A 


A 


T


G 


A 


G 


T 


T 


G 


T 


872




T 


G 


A 


T 




T 


A 


T 


G 


A 


G 


A 


T 


T 


A 


A 


G 


A 


G 


A 


A 


A 

in . 


Vj 


A 
in 


873




T 


T 


T 


G 


T 

X 


G 

VJ 


T 


G 


T 


T 


A 


T 


T 


G 


T 


A 


A 


T 


T 


G 


A 


G 

VJ 


A 
in 


X 


874


70


Q 


A 


T 


G 


T 

X 


rz 


T 


G 


A 


T 


A 


T 


G 


A 


T 


T 


A 


A 


A 


G 


A 


A 
in 


A 
in 


T 

X 


875




A 


G 


A 


T 


T 

X 


A 
in 


T 

X 


A 
in. 


G 


A 

in 


T 

X 


T 


T 


G 


T 


A 

in. 


ri 
\j 


A 


rz 


A 


A 


A 
in 


CI 
Uf 


T 

X 


876




Vj 


A 
in 


A 


G 


A 
in 


vj 


T 

X 


A 


T 


G 


T 


A 


A 


T 


A 


G 


X 


A 
in 


T 

X 


T 

X 


VJ 


X 


A 
in 


X 


877




T 


T 

X 


T 

X 


G 
vj 


X 


A 
.rl 


A 

in 


T 
X 




T 

X 


T 

X 




X 


T 


G 


A 

in. 


G 


T 

X 


T 

X 


T 

X 


A 
in 


A 
in 


ri 

Vj 


A 
in. 


878




A 
in 




T 

X 


A 
in 


A 
i"l 


A 
in 


X 


A 
in. 




T 

X 


A 
in 




T 

X 


A 
in 


T 


G 


A 
in 


A 
in 


X 


A 
in 


A 
in 


VJ 


A 
in 


ri 

VJ 


879




rz 


A 
in 


A 
in 


X 


r» 

Vj 


T 
X 


T 

X 


G 
vj 


A 
in 


A 

in 


T 
X 


T 

X 




A 


A 


A 
in. 


X 


A 
in 


T 

X 


ri 

\J 


A 
in 


cz 

vJ 


T 
X 


T 
X 


880




A 
in 


VJ 


X 


A 
in 


vj 


X 


T 

X 


A 
in 


A 


T 
X 


T 

X 


ri 


A 
in 


T 
X 


A 
in 


G 


X 


A 
in 


A 
in 


ri 

vJ 


T 
X 


T 


X 


r« 

vJ 


881




A 
in. 


rj 


X 


rz 
vj 


T 
1 


A 
in 


A 


A 
in 


n 
<j 


A 
in 


A 
in 


A 
in. 


X 


C 
\j 


A 
in 


A 
in 


T 




A 
in 


A 

in 


T 

X 


A 
in 


A 


r* 

vj 


882




T 
1 


Vj 


X 


T 

X 


A 
in 


n 


A 


X 


A 
in 


T 

X 


T", 
X 


T 
X 




T 


r; 


A 
in. 


A 
in 


A 
in 


T 
X 


ri 

VJ 


X 


p 

vJ 


A 

in 


A 
in. 


883




X 


rz 

Vj 


X 


A 
in 


T % 

X 


' n 
\j 


X 


J. 




A 
in 


V-J 


X 


T 
x 


T 

X 


n 

vjr 


A 
in 


A 
in 


X 


T 

X 


VJ 


T 
X 


T 
1 


A 
i-i 


1 


884




T 


rz 
vj 


A 
in 


rz 

Vj 


T 
X 




A 
in 


A 
in 


X 


X 


A 
in. 


ri 


T 

X 


T 

X 


A 
in 


X 




T 

, X 


X 


cz 

VJ 


X 


1 


A 
in. 


1 


885




r» 


A • 
in. 


A 
in. 


r« 

Vj 


A 
in 


A 
Pi. 


A 


Vj 


A 

in 


A 
in 


A 
in 


X 


rj 


A 
in 




A 
in 


A 
in 


A 
i-i 


n 


A 
in 


T 
X 


1 


A 
in 


1 


886




T 

X 


T 

X 


A 


A 

in 


VJ 


T 

X 


A 
in 


A 
in 


ri 
<j 


X 


X 


r; 

w 






T 


X 




A 
in 


T 

X 


A 
in 


T 

X 


T 
X 


A 
in 


Vj 


887




A 

in 


T 

X 


VJ 


A 
in 


T 

X 


rz 


X 


ri 

Vj 


T 


T 


X 


G 


A 


T 


T 


X 


ri 


A 
in 


A 
in 


T 


T 


ri 
Vj 


A 
in 


A 

in. 


888


79 


A 
c\ 


A 
in 


vj 


x 


A 


A 
r\ 


ri 

VJ 


X 




A 
in 


A 
in 


A 
in. 


T 
X 


T 




x 


T 
x 


ri 
\-J 


T 
X 


T 
X 


T 


CZ 


A 

in 


A 
in 


889




A 
in 


T 


ri 

Vj 


A 


A 
in 


r* 

<j 


T 

X 


G 


T 

X 


A 
in 


A 
in 


A 
in 


ri 
\j 


T 


T 


T 
x- 


ri 
\j 


A 
in 


A 
in 


A 
in 


cz 

vJ 


A 
in 


A 
in 


A 
in 


890




A 
in 


rz 
vj 


A 
in. 


G 

VJ 


A 
in 


CZ 


T 

X 


A 
in 


A 
in 


G 


A 
in. 


T 
X 


A 
in. 


A 
in 


T 


X 


n. 

. w 


T 

X 


A 
in. 


T 
X 


A 
in 


VJ 


X 


A 
in 


891




T 
X 


T 
x 


T 


A 


T 

X 


ri 
vj 


A 

in 




A 
in. 


T 

X 


A 
in 




A 
in 


T 
X 


G 


A 

in 


A 
in 


A 
in 


T 
X 


A 
in 


A 
in 


ri 

VJ 


X 


Vj 


892




A 


G 


A 


A 


A 
in 


X 


T 

X 


A 
in 


G 


T 


A 


G 


T 


A 


A 


X 


G 


A 


T 


T 


T 


CZ 

VJ 


T 

X 


Vj 


893




Q 


A 


T 


T 


T 
X 


rz 


A 
in. 




A 


T 


T 

X 


G 


A 


A 


T 


G 


A 




A 


A 


T 


A 
in 


X 


A 

i-i 


894




G 


A 


T 


T 


A 


rz 

vj 


A 


A 


A 


G 


A 


T 


G 


A 


A 


T 


A 


A 


A 


G 


A 


T 
X 


G 

vJ 


A 

rn 


895




T 


A 


G 


A 


T 


A 


G 


A 


A 


A 


G 


T 


A 


T 


A 


T 


G 


T 


T 


G 


T 


A 


G 


T 
X 


896




G 


A 


A 


G 


A 


T 


A 


G 


T 


A 


A 


A 


G 


T 


A 


A 


A 


G 


T 


A 


A 


G 


T 


T 


897




A 


A 


A 


T 


G 


T 


G 


T 


G 


T 


T 


T


A 


G 


T 


A 


G 


T 


T 


G 


T 


A 


A 


A 


898


75 


T 


T 


G 


T 


T 


G 


A 


A 


G 


T 


A 


A 


G 


A 


G 


A 


T 


G 


A 


A 


T 


A 


A 


A 


899




T 


A 


T 


T 


T 


G 


A 


G 


A 


G 


A 


A 


A 


G 


A 


A 


A 


G 


A 


G 


T 


T 


T 


A 


900 




T 


A 


T 


T 


T 


A 


G 


T 


G 


A 


T 


G 


A 


A 


T 


T 


T 


G 


T 


G 


A 


T 


G 


T 


901 




T 


T 


A 


T 


A 


G 


T 


G 


A 


T


G 


A 


T 


G 


A 


T 


A 


A 


G 


T 


T 


G 


A 


T 


902 




T 


A 


A 


A 


G 


A 


T 


A 


A 


T 


T 


G 


T 


A 


G 


A 


A 


A 


G 


T 


A 


G 


T 


G 


903 




G 


T 


T 


T 


A 


G 


T 


A 


T 


T 


G 


A 


T 


A 


T 


T 


G 


T 


G 


T 


G 


T 


A 


A 


904 




G 


T 


G 


T 


T 


G 


T 


G 


A 


A 


T 


A 


A 


G 


A 


T 


T 


G 


A 


A 


A 


T 


A 


T


905 




A 


A 


A 


G 


A 

in 


A 

in 


A 




T 


A 

in 


T 


A 


A 


A 


G 


T 


<j 


A 


G 


A 


T 


A 
in 


G 

VJ 


A 

in 


906




T 


A 


T 


T 


T 


G 


T 


A 


A 


G 


A 


A 


G 


T 


G 


T 


A 


G 


A 


T 


A 


T 


T 


G 


907 




T 


A 


G 


A 


A 


G 


A 


T 


G 


A 


A 


A 


T 


T 


G 


T 


G 


A 


T 


T 


T 


G 


T 


T 


908 




A 


T 


A 


A 


T 


A 


G 


T 


A 


A 


G 


T 


G 


A 


A 


T 


G 


A 


T 


G 


A 


G 


A 


T 


909 




A 


A 


T 


G 


T 


G 


A 


A 


T 


A 


A 


G 


A 


T 


A 


A 


A 


G 


T 


G 


T 


G 


T 


A 


910 




A 


T 


T 


G 


A 


A 


G 


A 


T 


A 


A 


A 


G 


A 


T 


G 


T 


T 


G 


T 


T 


T 


A 


G 


911 




T 


G 


A 


A 


A 


T 


A 


G 


A 


A 


G 


T 


G 


A 


G 


A 


T 


T 


A 


T 


A 


G 


T 


A 


912 


76 


A 


G 


T 


T 


A 


T 


T 


G 


T 


G 


A 


A 


A 


G 


A 


G 


T 


T 


T 


A 


T 


G 


A 


T 


913 




A 


A 


A 


T 


A 


G 


T 


A 


G 


T 


G 


A 


T 


A 


G 


A 


G 


A 


A 


G 


A 


T 


T 


T 


914 




A 


G 


T 


G 


T 


A 


T 


G 


A 


A 


G 


T 


G 


T 


A 


A 


T 


A 


A 


G 


A 


T 


T 


A 


915 




T 


G 


A 


T 


T 


A 


A 


G 


A 


T 


T 


G 


T 


G 


T 


A 


G 


T 


G 


T 


T 


A 


T 


A 


916 








A 


G 


T 


T 


T 


A 


T 


G 


A 


T 


A 


T 


T 


T 


G 


T 


A 


G 


A 


T 


G 


A 


G 


T


917 




T 


A 


T 


G 


T 


G 


T 


A 


T 


G 


A 


A 


G 


A 


T 


T 


A 


T 


A 


G 


T 


T 


A 


G 


918 


78 


G 


A 


A 


A 


T 


T 


G 


T 


T 


G 


T 


A 


T 


A 


G 


A 


G 


T 


G 


A 


T 


A 


T 


A 


919 


_ 


T 


A 


G 


A 


A 


A 


T 


A 


G 


T 


T 


T 


A 


A 


G 


T 


A


T 


A 


G 


T 


G 


T 


G 


920 


_ 


T 


G 


A 


T 


T 


T 


A 


G 


A


T 


G 


T 


T 


T 


A 


T 


T 


G 


T 


G 


A 


G 


A 


A


921 




A 


A 


G 


T 


T 


G 


A 


T 


A 


T 


T 


T 


G 


T 


T 


G 


T 


T 


A 


G 


A 


T 


G 


A 


922 


_ 


T 


G 


A 


T 


G 


T 


G 


A 


T 


A 


A 


T 


G 


A 


G 


A 


A 


T 


A 


A


A 


G 


A 


A 


923 


79 


A 


A 


A 


G 


T 


T 


T 


A 


G 


T 


T 


T 


G 


T 


A 


T 


T 


A 


G 


T 


A 


G 


A 


G 


924




A 


G 


T 


T 


T 


G 


A 


T 


G 


T 


G 


A 


T 


A 


G 


T 


A 


A 


A 


T 


A 


G 


A 


A 


925 


_ 


A 


A 


G 


T 


G 


T 


T 


A 


T 


T 


G 


A 


A 


T 


G 


T 


G 


A 


T 


G 


T 


T 


A 


T 


926 


_ 


A 


A 


A 


T 


T 


G 


A 


A 


G 


T 


G 


T 


G 


A 


T 


A 


A 


T 


G 


T 


T 


T 


G 


T 


927 




G 


T 


T 


T 


A 


G 


T 


G 


A 


T 


T 


A 


A


A 


G 


A 


T 


A 


G 


A 


T 


T 


A 


G 


928


82 


A 


T 


A 


A 


G 


T 


G 


T 


A 


T 


A 


A 


G 


A 


G 


A 


A 


G 


T 


G 


T 


T 


A 


A 


929 




A 


T 


G 


A 


A 


T 


T 


T 


G 


T 


T 


T 


G 


T 


G 


A 


T 


G 


A 


A 


G 


T 


T 


A 


930 


_ 


A 


A 


A 


G 


A 


A 


T 


T 


G 


A 


G 


A 


A 


A 


T 


G 


A 


A 


A 


G 


T 


T 


A 


G 


931 


_ 


A 


G 


T 


G 


T 


A 


A 


G 


A 


G 


T 


A 


T 


A 


A 


A 


G 


T 


A 


T 


T 


T 


G 


A 


932 


_ 


G 


A 


A


T 


T 


A 


A 


G 


A 


T 


T 


G 


T 


T 


A 


T 


A 


T 


G 


T 


G 


A 


G 


T 


933 




T 


A 


T 


G 


A 


A 


A 


G 


T 


G 


T 


T 


G 


T 


T 


T 


A 


A 


G 


T 


A 


A 


G 


A 


934 


_ 


T 


A 


A 


A 


G 


T 


A 


A 


A 


T 


G 


T 


T 


A 


T 


G 


T 


G 


A 


G 


A 


G 


A 


A 


935 


_ 


A 


A 


A 


G 


A 


T 


A 


T 


T 


G 


A 


T 


T 


G 


A 


G 


A 


T 


A 


G 


A 


G 


T 


T 


936 


_ 


A 


A 


G 


T 


G 


A 


T 


A 


T 


G 


A 


A 


T 


A 


T 


G 


T 


G 


A 


G 


A 


A 


A 


T 


937 


_


A 


A 


A 


T 


A 


G 


A 


G 


T 


T 


T 


G 


T 


T 


A 


A 


T 


G 


T 


A 


A 


G 


T 


G 


938 


- 


G 


A 


T 


T 


T 


A 


G 


A 


T 


G 


A 


G 


T 


T 


A 


A 


G 


A 


A 


T 


T 


T 


A 


G 


939 


_


T 


T 


G 


T 


A 


A 


A 


T 


G 


A 


G 


T 


G 


T 


G 


A 


A 


T 


A 


T 


T 


G 


T 


A 


940


_ 


A 


G 


T 


A 


G 


T 


G 


T 


A 


T 


T 


T 


G 


A 


G 


A 


T 


A 


A 


T 


A 


G 


A 


A 


941 




T 


G 


A 


G 


T 


T 


A 


A 


A 


G 


A 


G 


T 


T 


G 


T 


T 


G 


A 


T 


A 


T 


T 


T 


942 




A 


A 


A 


G 


A 


G 


T 


G 


T 


A 


T 


T 


A 


G 


A 


A 


A 


T 


A 


G 


T 


T 


T 


G 


943 




G 


T 


T 


T 


A 


G 


T 


T 


A 


T 


T 


T 


G 


A 


T 


G 


A 


G 


A 


T 


A 


A 


T 


G 


944 


_ 


A 


A 


G 


T 


G 


T 


A 


A 


A 


T 


G 


A 


A 


T 


A 


A 


A 


G 


A 


G 


T 


T 


G 


T 


945 




A 


A 


T 


A 


A 


A 


G 


T 


G 


A 


G 


T


A 


G 


A 


A 


G 


T 


G 


T 


A 


A 


T 


T 


946 


_ 


T 


A 


T 


T 


G 


A 


G 


T 


T 


T 


G 


T 


G 


T 


A 


A 


A 


G 


A 


A 


G 


A 


T 


A 


947 




T 


T 


T 


A 


T 


A 


G 


T 


T 


G 


T 


T 


G 


T 


G 


T 


T 


G 


A 


A 


A 


G 


T 


T 


948 




A 


T 


G 


A 


A 


A 


T 


A 


T 


G 


A 


T 


T 


G 


T 


G 


T 


T 


T 


G 


T 


T 


G 


T 


949 




A 


A 


A 


G 


A 


G 


A 


T 


G 


T 


A 


A 


A 


G 


T 


G 


A 


G 


T 


T 


A 


T 


T 


A 


950 


_ 


T 


T 


G 


A 


A 


G 


A 


A 


A 


G 


T 


T 


A 


G 


A 


T 


G 


A 


T 


G 


A 


A 


T 


T 


951 


_ 


A 


T 


G 


T


T 


A 


T 


T 


T 


G 


T 


T 


T 


A 


G 


T 


T 


T 


G 


T 


G 


T 


G 


A 


952 


_ 


A 


A 


A 


T 


A 


T 


G 


A 


A 


T 


T 


T 


G 


A 


A 


G 


A 


G 


A 


A 


G 


T 


G 


A 


953 




G 


A 


T 


T 


A 


G 


A 


T 


A 


T 


A 


G 


A 


A 


T 


A 


T 


T 


G 


A 


A 


G 


A 


G 


954




T 


T 


A 


G 


A 


A 


T 


A 


A 


G 


A 


G 


A 


A 


A 


T 


G 


T 


A 


T 


G 


T 


G 


T 


955 


_ 


T 


T 


T 


A 


T 


G 


A 


A 


A 


G 


A 


G 


A 


A 


G 


T 


G 


T 


A 


T 


T 


A 


T 


G 


956 


_ 


G 


T 


A 


A 


G 


T 


A 


T 


T 


A 


A 


G 


T 


G 


T 


G 


A 


T 


T 


T 


A 


G 


T 


A 


957


_ 


A 


T 


A 


A 


A 


G 


A 


G 


A 


A 


G 


T 


A 


A 


A 


G 


A 


G 


T 


A 


A 


A 


G 


T 


958


_


A 


T 


T 


G 


T 


T 


A 


A 


T 


T 


G 


A 


A 


G 


T 


G 


T 


A 


T 


G 


A 


A 


A 


G 


959 


_ 


T 


A 


T 


A 


T 


A 


G 


T 


T 


G 


A 


G 


T 


T 


G 


A 


G 


T 


A 


A 


G 


A 


T 


T 


960




T 


A 


G 


A 


T 


G 


A 


G 


A 


T 


A 


T 


A 


T 


G 


A 


A


A 


G 


A 


T 


A 


G 


T 


961 




A 


T 


A 


A 


G 


A 


A 


G 


A 


T 


G 


A 


T 


T 


T 


G 


T 


G 


T 


A 


A 


A 


T 


G 


962




T 


T 


A 


G 


T 


A 


A 


T 


A 


A 


G 


A 


A 


A 


G 


A 


T 


G 


A 


A 


G 


A 


G 


A 


963 




G 


A 


T


T 


T 


G 


T 


G 


A 


G 


T 


A 


A 


A 


G 


T 


A 


A 


A 


T 


A 


G 


A 


A 


964 




A 


A 


A 


T 


A 


G 


A 


T 


G 


T 


A 


G 


A 


A 


T 


T 


T 


G 


T 


G 


T 


G 


T 


T 


965 




G 


A 


A 


A 


T 


T 


A 


G 


T 


G 


T 


T 


T 


G 


T 


G 


T 


G 


T 


A


T 


T 


A 


T 


966 




A 


T 


T 


T 


G 


A 


G 


T 


A 


T 


G 


A 


T 


A 


G 


A 


A 


G 


A 


T 


T 


G 


T 


T 


967 




A 


T 


A 


G 


A 


G 


T 


T 


G 


A 


A 


G 


T 


A 


T 


G 


T 


A 


A 


A 


G 


T 


T 


T 


968 




T 


A 


A 


T 


T 


T 


G 


T 


G 


A 


A 


T 


G 


T 


T 


G 


T 


T 


A 


T 


T 


G 


T 


G 


969




T 


T 


A 


G 


T 


T 


T 


A 


T 


G 


A 


G 


A 


G 


T 


G 


A 


G 


A 


T 


T 


T 


A 


A 


970 








G 


T 


T 


G 


T 


T 


A 


G 


A 


G 


T 


G 


T 


T 


T 


A 


T 


G 


A 


A 


A 


T 


T 


T 


971 




T 


T 


T 


A 


T 


T 


G 


T 


G 


A 


T 


G 


T 


G 


A 


A 


A 


T 


A 


A 


G


A 


G 


A 


972


_ 


G 


T 


A 


A 


G 


T 


A 


A 


T 


A


T 


G 


A 


T 


A 


G 


T 


G 


A 


T 


T 


A 


A 


G 


973


_ 


T 


G 


A 


G 


A 


T 


G 


A 


T 


G 


T 


A 


T 


A 


T 


G 


T 


A 


G 


T 


A 


A 


T 


A 


974 


_ 


A 


A 


T 


T 


G 


A 


G 


A 


A 


A 


G 


A 


G 


A 


T 


A 


A 


A 


T 


G 


A 


T 


A 


G 


975 


85 


T 


T 


T 


G 


A 


A 


G 


T 


G 


A 


T 


G 


T 


T 


A 


G 


A 


A 


T 


G 


T 


T 


T 


A 


976 




A 


G 


T 


T 


G 


T 


T 


G 


T 


G 


T 


A 


A 


T 


T 


G 


T 


T 


A 


G 


T 


A 


A 


A 


977 




A 


T 


A 


G 


T 


G 


A 


G 


A 


A 


G 


T 


G 


A 


T 


A 


A 


G 


A 


T 


A 


T 


T 


T 


978 


_ 


G 


T 


G 


T 


G 


A 


T 


A 


A 


G 


T 


A 


A 


T 


T 


G 


A 


G 


T 


T 


A 


A 


A 


T 


979 




T 


A 


G 


T 


T 


A 


T 


T 


G 


T 


T 


T 


G 


T


G 


A 


A 


T 


T


T 


G 


A 


G 


A 


980 




A 


T 


A 


G 


T 


T 


G 


A 


A 


T 


A 


G 


T 


A 


A 


T 


T 


T 


G 


A 


A 


G 


A 


G


981




A 


T 


G 


T 


T 


T 


G 


T 


G 


T 


T 


T 


G 


A 


A 


T 


A 


G 


A 


G 


A 


A 


T 


A 


982 




T 


G 


A 


T 


A 


A 


A 


G 


A 


T 


A 


T 


G 


A 


G 


A 


G 


A 


T 


T 


G 


T 


A 


A 


983 




T 


A 


A 


A 


G 


A 


T 


G 


A 


G 


A 


T 


G 


T 


T 


G 


T 


T 


A 


A 


A


G 


T 


T


984




A 


A 


G 


T 


G 


A 


A 


A 


T 


T 


T 


G 


T 


A 


A 


G 


A 


A 


T 


T 


A 


G 


T 


G 


985 




G 


A 


A 


A 


T 


G 


A 


G 


A 


G 


T 


T 


A 


T 


T 


G 


A 


T 


A 


G 


T 


T 


T 


A 


986 




T 


T 


T 


G 


T 


A 


A 


A 


T 


G


A 


G 


A 


T 


A 


T 


A 


G 


T 


G 


T 


T 


A 


G 


987 




G 


T 


T 


A 


A 


T 


T 


G 


T 


G 


A 


T 


A 


T 


T 


T 


G 


A 


T 


T 


A 


G 


T 


G 


988 




A 


G 


A 


G 


T 


G 


T 


T 


G 


A 


T 


A 


A 


A 


G 


A 


T 


G 


T 


T 


T 


A 


T 


A 


989




A 


A 


T 


T 


G 


T 


G 


A 


G 


A 


A 


A 


T 


T 


G 


A 


T 


A 


A 


G 


A 


A 


G 


A 


990 




T 


T 


A 


A 


A 


G 


A 


G 


A 


A 


T 


T 


G 


A 


G 


A 


A 


G 


A 


G 


A 


A 


A 


T 


991 




T 


T 


G 


T 


T 


A 


G 


A 


A 


G


A 


A 


T 


T 


G 


A 


A 


T 


G 


T 


A 


T 


G 


T 


992 




A 


G 


T 


T 


A 


A 


G


A 


T 


A 


T 


G 


T 


G 


T 


G 


A 


T 


G 


T 


T 


T 


A 


A 


993 




T 


G 


A 


G 


T 


T 


A 


T 


G 


T 


T 


G 


T 


A 


A 


T 


A 


G 


A 


A 


A 


T 


T 


G 


994 




T 


T 


A 


G 


A 


T 


A 


A 


G 


T 


T 


T 


A 


G 


A 


G 


A 


T 


T 


G 


A 


G 


A 


A 


995




A 


T 


G 


A 


G 


T 


A 


A 


T 


A 


A 


G 


A 


G


T 


A 


T 


T 


T 


G 


A 


A 


G 


T 


996 




T 


G 


T 


T 


T 


A 


A 


G 


T 


G 


T 


A 


A 


T 


G 


A 


T 


T 


T 


G 


T 


T 


A 


G 


997 




T 


T 


G 


A


A 


G 


A 


A 


G 


A 


T 


T 


G 


T 


T 


A 


T 


T 


G 


T 


T 


G 


A 


A 


998 




T 


A 


T 


A 


G 


A 


A 


A 


G 


A 


T 


T 


A 


A 


A 


G 


A 


G 


T 


G 


A 


A 


T 


G 


999 




T 


A 


A 


A 


T 


T 


G 


T 


T 


A 


G 


A 


A 


A 


T 


T 


T 


G 


A 


G 


T 


G 


T 


G 


1000 




A 


T 


T 


G 


T 


T 


A 


G 


T 


G 


T 


G 


T


T 


A 


T 


T 


G 


A 


T 


T 


A 


T 


G 


1001 




G 


A 


G 


A 


A 


T 


T 


A 


T 


G 


T 


G 


T 


G 


A 


A 


T 


A 


T 


A 


G 


A 


A 


A 


1002 




T 


T 


G 


A 


T 


T 


G 


A 


T 


A 


A 


A 


G 


T


A 


A 


A 


G 


A 


G 


T 


G 


T 


A


1003 




G 


T 


G 


T 


G 


T 


A 


A 


A 


T 


T 


G 


A 


A 


T 


A 


T 


G 


T 


T 


A 


A 


T 


G 


1004 




A 


A 


A 


G 


T 


A 


A 


A 


G 


A 


A 


A 


G 


A 


A 


G 


T 


T 


T 


G 


A 


A 


A 


G 


1005 




T 


T 


T 


A 


G 


T 


T 


G 


A 


A 


G 


A 


A 


T 


A 


G 


A 


A 


A 


G 


A 


A 


A 


G 


1006 


_ 


G 


T 


G 


T 


A 


A 


T 


A 


A 


G 


A 


G 


T 


G 


A 


A 


T 


A 


G 


T 


A 


A 


T 


T 


1007 


_ 


T 


A 


T 


T 


G 


A 


A 


A 


T 


A 


A 


G 


A 


G 


A 


G 


A 


T 


T 


T 


G 


T 


G 


A 


1008 


_ 


A 


T 


G 


A 


G 


A 


A 


A 


G 


A 


A 


G


A 


A 


G 


T 


T 


A 


A 


G 


A 


T 


T


T 


1009 


_ 


A 


A 


G 


A 


G 


T 


G 


A 


G 


T 


A 


T 


A 


T 


T 


G 


T 


T 


A 


A 


A 


G 


A 


A


1010 


_ 


T 


T 


T 


G 


T 


A 


A 


A 


G 


T 


G


A 


T 


G 


A 


T 


G 


T 


A 


A 


G 


A 


T 


A 


1011


_ 


G 


A 


T 


G 


T 


T 


A 


T 


G 


T 


G 


A 


T 


G 


A 


A 


A 


T 


A 


T 


G 


T 


A 


T 


1012 


_ 


G 


T 


A 


G 


A 


A


T 


A 


A 


A 


G 


T


G 


T 


T 


A 


A 


A 


G 


T 


G 


T 


T 


A 


1013 


_ 


A 


A 


A 


G 


A 


G 


T 


A 


T 


G 


T 


G 


T 


G 


T 


A 


T 


G 


A 


T 


A 


T 


T 


T 


1014 


_


A 


A 


A 


G 


A 


T 


A 


A 


G 


A 


G 


T 


T


A 


G 


T 


A 


A 


A 


T 


T 


G 


T 


G 


1015




A 


A 


G 


A 


A 


T 


T 


A 


G 


A 


G 


A


A 


T 


A 


A 


G 


T 


G 


T 


G 


A 


T 


A 


1016 




G 


A 


T 


A 


A 


G 


A 


A 


A 


G 


T 


G 


A 


A 


A 


T


G


T 


A 


A 


A 


T 


T 


G 


1017 


86 


G 


A 


T 


G


A 


A 


A 


G 


A 


T 


G 


T 


T 


T 


A 


A 


A 


G 


T 


T 


T 


G 


T 


T 


1018




A 


G 


T 


G 


T 


A 


A 


G 


T 


A 


A 


T


A 


A 


G 


T 


T


T 


G 


A 


G 


A 


A 


A 


1019




G 


T 


T 


G 


A 


G 


A 


A 


T 


T 


A 


G 


A 


A 


T 


T 


T 


G 


A 


T 


A 


A 


A 


G 


1020 


87 


T 


T 


A 


A 


G 


A 


A 


A 


T 


T 


T 


G 


T 


A 


T 


G 


T 


G 


T 


T 


G 


T 


T 


G 


1021 




A 


G 


A 


A 


G 


A 


T 


T 


T 


A 


G 


A 


T 


G 


A 


A 


A 


T 


G 


A 


G 


T 


T 


T 


1022 




T 


A 


A 


G 


T 


T 


T 


G 


A 


G 


A 


T 


A 


A 


A 


G 


A 


T 


G 


A


T 


A 


T 


G 


1023 




T 


G 


A 


G 


A 


T 


A 


G 


T 


T 


T 


G 


T 


A 


A 


T 


A 


T 


G 


T 


T 


T 


G 


T 


1024 








A 


G 


T 


T 


T 


G 


A 


A 


A 


T 


T 


G 


T


A 


A 


G 


T 


T 


T 


G 


A 


T 


G 


A 


1025 




T 


A 


G 


A 


A 


T 


T 


G 


A 


T 


T 


A 


A 


T 


G 


A 


T 


G


A 


G 


T 


A 


G 


T 


1026




A 


G 


A 


G 


A 


T 


T 


T 


G 


T 


A 


A 


T 


A 


A 


G 


T 


A 


T 


T 


G 


A 


A 


G 


1027




A 


T 


A 


A 


T 


G 


A 


T 


G 


T 


A 


A 


T 


G 


T 


A 


A 


G 


T 


A 


G 


T 


G 


T 


1028




T 



G 


A 


A 


A 


T 


T 


T 


G 


A 


T 


G 


A 


G 


A 


G 


A 


T 


A 


T 


G 


T 


T 


A 


1029




T 



G 


T 


G 


X 


A 
in. 


A 


A 


G 


T 


A 


T 


A 


G 


T 


T 


T 


A 


T


G 


T 


T 


A 


G 






x 


r* 






T 

X 


in. 


A 


G 


T 


G 


A 


A 


A 


T 


A 


G 


A 


A 


T 


G 


A 


A 


T 


G 


1031




A 


A 


A 


G 


A 
in. 


A 


A 


G 


A 


T 


T 


G 


T 


A 


A 


T 


A 


A 


G 


T 


A 


G 


A 


G


1032




A 
M. 


A 
Pi 


T 

X 


p 


A 
A 


A 
in. 


A 

Pl 


T 

X 


A 
r\ 


p 


T 

X 


p. 


T 


T 


A 


A 


A 
r\ 


T 

X 


p 


A 
r\ 


p 


T 

X 


p 


T 


1033


ft Q 


p 


T 
X 


A 
in. 


p 


A 
in. 


X 


A 

M. 


A 
r\ 


A 


p 

w 


A 
r\ 


T 


G 


T 


G 


A 


A 


T 


T 


A 


T 


p 


A 
Pi. 


T 


1034






A 
in. 


X 


A 
in. 


P 
Li 


T 
X 


A 
in 


T 
X 


A 


X 


P 


X 


p 


T 




X 


A 


T 

X 


T 

X 


T 

X 


p 


T 

X 


T 


T 


1035




A 
M. 


X 


p 


X 


X 


X 


P 


T 

X 


A 
P\. 


p 


A 


A 


A 
r\ 


T 

X 




X 


T 

X 


X 


p 

w 


A 
in. 


A 
in. 


p 

Vjr 


A 
in. 


T 
1 


1036




A 
in. 


A 

M. 


A 
in. 


T 

X 


X 


T 
X 


P 
Li 


T 

X 


A 


G 


A 
J-i 


P 

V_7 


A 


G 


A 


A 


A 


T 


T 

X 


X 


G 


X 


T 
1 


p 

vj 


1037




T 


in. 


P 


A 

in. 


A 
in. 


X 


A 
in. 


A 
J-\. 


P 


A 
r\ 


X 


T 
1 


A 
jrt. 


G 


T 


A 


A 




T 

X 


P 

VJ7 


T 


A 
P\ 


P 


A 
Pi 


1038




T 

X 


p 


A 
in. 


T 

X 


X . 


T 
1 


A 
in. 


p 


A 
G


T 


G 


A 


A 


T 


A 


A 


T 


G 


A 


A 


G 


A 


A 


G 


T 


T 


A 


T 


G 


T 


T 


A 


1125 


98 


T 


T 


G 


T 


G 


A 


A 


T 


A 


A 


A 


G 


T 


A 


G 


A 


T 


G 


T 


G 


T 


T 


A 


T 


1126 




T 


T 


A 


T 


A 


T 


G 


A 


T 


A 


T 


G 


A 


G 


T 


T 


T 


G 


T 


G 


T 


T 


G 


A 


1127




T 


T 


G 


A 


T 


T 


T 


G 


T 


G 


T 


G 


A 


G 


T 


A 


T 


T 


A 


G 


T 


T 


A 


T 


1128 




A 


A 


A 


G


T 


G 


A 


T 


T 


A 


A 


G 


T 


T 


A 


G 


T 


T 


T 


G 


A 


G 


A 


T 


1129 




T 


T 


G 


T 


A 


T 


T 


T 


G 


T 


A 


T 


A 


A 


T 


G 


T 


T 


G 


A 


A 


G 


A 


G 


1130 




G 


T 


T 


T 


G 


A 


A 


A 


T 


T 


A 


G 


T 


G 


T 


G 


A 


G 


A 


A 


A 


T 


A 


T 


1131 




A 


A 


T 


G 


T 


T 


G 


A 


G 


A 


T 


T 


G 


A 


T 


A 


A 


T 


G 


T 


T 


G 


A 


A 


1132 








T 


A 


G 


T


A 


G 


T 


A 


G 


T 


A 


T 


T 


G 


T 


T 


G 


T 


A 


A 


T 


A 


A 


G 


1133 




G 


T 


T 


G 


T 


A 


A 


T 


T 


T 


G 


A 


G 


T 


G 


T 


T 


A 


G 


T 


T 


A 


T 


T 


1134 




T 


G 


A 


A 


T 


A 


T 


G 


A 


T 


A 


G 


T 


T 


A 


G 


T 


A 


A 


T 


T 


G 


T 


G 


1135 


- 


T 


G 


A 


T 


A 


G 


T 


A 


T 


G 


T 


T 


T


G 


T 


G 


A 


T 


T 


A 


A 


A 


G 


A 


1136 


- 


G 


A 


T 


G 


T 


A 


T 


A 


A 


A 


G 


A 


G 


T 


A 


T 


G 


T 


T 


A 


T 


A 


A 


G 


1137 


- 


A 


G 


T 


G 


A 


G 


A 


T 


T 


T 


A 


G 


A 


A 


G 


A 


T 


G 


T 


T 


A 


T 


T 


A 


1138 


- 


A 


T 


G 


A


G 


A 


A 


T 


T 


T 


G 


T 


T 


A 


A 


A 


G 


A 


G 


A 


A 


A 


G 


T 


1139 


- 


A 


A 


A 


G 


A 


A 


T 


T 


A 


G 


T 


A 


T 


G 


A 


T 


A 


G 


A 


T 


G 


A 


G 


A 


1140 


99 


T 


A 


G 


A 


G 


T 


T 


G 


T 


A 


T 


A 


G 


T 


T 


T 


A 


T 


A 


G 


T 


T 


G 


A 


1141 


- 


G 


T 


A 


G 


A 


A 


T 


G 


A 


T 


T 


G 


T 


T 


T 


A 


G 


A 


A 


G


A 


T 


T 


T 


1142 


- 


G 


T 


T 


T 


A 


T 


G 


T 


T 


T 


G 


A 


G 


A 


A 


G 


A 


G 


T 


T 


A 


T 


T 


T 


1143 


- 


T 


A 


G 


A 


A 


G 


T 


T 


T 


G 


A 


A 


A 


G 


T 


T 


A 


T 


T 


G 


A 


T 


T 


G 


1144 


- 


G 


A 


T 


G 


A 


A 


G 


A 


G 


T 


A


T 


T 


T 


G 


T 


T 


A 


T 


A 


T 


G 


T 


A 


1145


- 


G 


A 


T 


G 


A 


A 


T 


A 


T 


A 


G 


T 


A 


A 


G 


T 


A 


T 


T 


G 


A 


G 


T 


A 


1146 


100 


T 


A 


G 


T 


G 


A 


T 


G 


A 


A 


A 


T 


T 


T 


G 


A 


G 


A 


T 


A 


G 


A 


T 


A 


1147 


- 


G 


A 


A 


A 


G 


A 


A 


A 


T 


T 


G 


A 


A 


G 


A 


G 


T 


T 


T 


G 


A 


T 


A 


T 


1148 


- 


A 


T 


T 


T 


G 


A 


G 


T 


A 


T 


T 


T 


G 


T 


G 


T 


A 


T 


T


G 


A 


A 


T 


G 


1149 




A 


T 


G 


A 


G 


T 


T 


G 


A 


A 


A 


T 


T 


T 


G 


A 


A 


G 


T 


A 


T 


T 


G 


T 


1150 




T 


T 


A 


A 


T 


A 


G 


T 


G 


A 


G 


A 


G 


A 


G 


T 


A 


T 


A 


T 


G 


T 


A 


A 


1151 




A 


T 


T 


A 


A 


G 


A 


G 


A 


G 


T 


G 


A 


G 


T 


A 


A 


A 


T 


G 


T 


A 


A 


A 


1152 


- 


A 


A 


G 


A 


A 


T 


A 


G 


A 


T 


G 


A 


G 


A 


T 


T 


A 


G 


A 


A 


A 


T 


A 


G 


1153 


_ 


A 


G 


T 


T 


T 


A 


A 


A 


G 


A 


G 


T 


T 


A 


G 


A 


A 


T 


T 


G 


A 


A 


A 


G 


1154 


_ 


G 


T 


A 


A 


G 


A 


T 


T 


T 


G 


T 


T 


G 


A 


A 


T 


A 


A 


A 


G 


A 


A 


G 


A 


1155 


_ 


A 


G 


A 


G 


A 


A 


A 


G 


A 


A 


G 


T 


T 


A 


A 


A 


G 


T 


G 


A 


T 


A 


T 


T 


1156 


_ 


T 


A 


A 


T 


A 


G 


A 


G 


A 


A 


G 


A 


G 


A 


T 


G 


T 


A 


T 


G 


A 


A 


T 


A 


1157 




T 


T 


A 


T 


T 


A 


G 


T 


G 


A 


T 


A 


A 


G 


T 


G 


A 


A 


G 


T 


T 


T 


A 


G 


1158 


_ 


A 


T 


A 


A 


T 


G 


T 


A 


A 


A 


G 


A 


T 


G 


A 


G 


T 


T 


T 


A 


T 


G 


A 


G 


1159 




T 


T 


G 


A 


T 


T 


T 


G 


A 


G 


A 


G 


T 


T 


G 


A 


T 


A 


A 


G 


A 


T 


T


T


1160 




A 


T 


G 


A 


T 


T 


A 


T 


T 


G 


T 


G 


T 


G 


T 


A 


G 


A 


A 


T 


T 


A 


G 


A 


1161




T 


A 


T 


A 


A 


A 


G 


A 


T 


A 


T 


A 


G 


T 


A 


G


A 


T 


G 


A 


T 


G 


T 


G 


1162




T 


T 


T 


A 


G 


T 


T 


G


A 


G 


A 


T 


G 


A 


A 


G 


T 


T 


A 


T 


T 


A 


G 


A 


1163 




A 


T 


T 


G 


A 


A 


T 


T 


G 


A 


T 


A 


T 


A 


G 


T 


G 


T 


A 


A 


A 


G 


T 


G


1164 




G 


A 


A 


G 


A 


A 


A 


G 


A 


T 


T 


A 


T 


T 


G 


T 


A 


T 


T 


G 


A 


G 


T 


T 


1165 




A 


T 


T 


G 


A 


G 


T 


G 


T 


A 


G 


T 


G 


A 


T 


T 


T 


A 


G 


A 


A 


A 


T 


A 


1166 




A 


A 


T 


A 


A 


A 


G 


T 


G 


T 


T 


T 


A 


A 


G 


A 


G 


T 


A 


G 


A 


G 


T 


A 


1167 




G 


T 


A 


G 


A 


G 


A 


T 


A 


A 


T 


T 


G 


A 


T 


G 


T 


G 


T 


A 


A 


T 


T 


T 


1168 





